TTS models

This document lists all text-to-speech models supported in sherpa-onnx.

Monolingual

The following table lists models by language.

Mixed-lingual

The following lists models supporting multiple languages.

Chinese+English

This section lists text-to-speech models for Chinese+English.

matcha-icefall-zh-en

Info about this model

This model was trained using code modified from https://github.com/k2-fsa/icefall/tree/master/egs/baker_zh/TTS/matcha

It is from https://modelscope.cn/models/dengcunqin/matcha_tts_zh_en_20251010

It supports Chinese and English.

Number of speakers    Sample rate
1                     16000

Download the model

You need to download the acoustic model and the vocoder model.

Download the acoustic model

Please use the following code to download the model:

wget https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/matcha-icefall-zh-en.tar.bz2

tar xvf matcha-icefall-zh-en.tar.bz2
rm matcha-icefall-zh-en.tar.bz2

You should see the following output:

ls -lh matcha-icefall-zh-en/
total 168432
-rw-r--r--@   1 fangjun  staff    58K  4 Dec 14:29 date-zh.fst
drwxr-xr-x@ 122 fangjun  staff   3.8K 28 Nov  2023 espeak-ng-data
-rw-r--r--@   1 fangjun  staff   1.3M  4 Dec 14:29 lexicon.txt
-rw-r--r--@   1 fangjun  staff    72M  4 Dec 14:29 model-steps-3.onnx
-rw-r--r--@   1 fangjun  staff    63K  4 Dec 14:29 number-zh.fst
-rw-r--r--@   1 fangjun  staff    87K  4 Dec 14:29 phone-zh.fst
-rw-r--r--@   1 fangjun  staff   2.0K  4 Dec 14:29 README.md
-rw-r--r--@   1 fangjun  staff    21K  4 Dec 14:29 tokens.txt

Download the vocoder model

wget https://github.com/k2-fsa/sherpa-onnx/releases/download/vocoder-models/vocos-16khz-univ.onnx

You should see the following output:

ls -lh vocos-16khz-univ.onnx

-rw-r--r--@ 1 fangjun  staff    51M  4 Dec 14:54 vocos-16khz-univ.onnx
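
Before moving on, it can help to confirm that every file the examples below refer to is actually on disk. The following is a minimal sketch and not part of sherpa-onnx: the helper name `missing_model_files` is hypothetical, and the file list is simply copied from the directory listings above.

```python
from pathlib import Path

# File names copied from the listings above; adjust if your download differs.
ACOUSTIC_MODEL_FILES = [
    "model-steps-3.onnx",
    "lexicon.txt",
    "tokens.txt",
    "date-zh.fst",
    "number-zh.fst",
    "phone-zh.fst",
    "espeak-ng-data",
]
VOCODER_FILE = "vocos-16khz-univ.onnx"


def missing_model_files(root="."):
    """Return the expected paths under `root` that do not exist (hypothetical helper)."""
    root = Path(root)
    expected = [root / "matcha-icefall-zh-en" / name for name in ACOUSTIC_MODEL_FILES]
    expected.append(root / VOCODER_FILE)
    return [str(path) for path in expected if not path.exists()]


if __name__ == "__main__":
    missing = missing_model_files()
    if missing:
        print("Missing files:")
        for path in missing:
            print("  ", path)
    else:
        print("All model files found")
```

Running it from the directory that holds both `matcha-icefall-zh-en/` and `vocos-16khz-univ.onnx` should report that all files were found.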

Huggingface space

You can try this model by visiting https://huggingface.co/spaces/k2-fsa/text-to-speech

Huggingface space (WebAssembly, wasm)

You can try this model by visiting

https://huggingface.co/spaces/k2-fsa/web-assembly-zh-en-tts-matcha

The source code is available at https://github.com/k2-fsa/sherpa-onnx/tree/master/wasm/tts

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v

ABI            URL         China mirror
arm64-v8a      Download    Download
armeabi-v7a    Download    Download
x86_64         Download    Download
x86            Download    Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

Python API

The following code shows how to use the Python API of sherpa-onnx with this model.

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        matcha=sherpa_onnx.OfflineTtsMatchaModelConfig(
            acoustic_model="matcha-icefall-zh-en/model-steps-3.onnx",
            vocoder="vocos-16khz-univ.onnx",
            lexicon="matcha-icefall-zh-en/lexicon.txt",
            tokens="matcha-icefall-zh-en/tokens.txt",
            data_dir="matcha-icefall-zh-en/espeak-ng-data",
        ),
        num_threads=2,
        debug=True, # set it False to disable debug output
    ),
    max_num_sentences=1,
    rule_fsts="matcha-icefall-zh-en/phone-zh.fst,matcha-icefall-zh-en/date-zh.fst,matcha-icefall-zh-en/number-zh.fst",
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
text = "我最近在学习machine learning,希望能够在未来的artificial intelligence领域有所建树。在这次vocation中,我们计划去Paris欣赏埃菲尔铁塔和卢浮宫的美景。某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。开始数字测试。2025年12月4号,拨打110或者189202512043。123456块钱。在这个快速发展的时代,人工智能技术正在改变我们的生活方式。语音合成作为人工智能的重要应用之一,让机器能够用自然流畅的语音与人类进行交流。"


audio = tts.generate(text, sid=0, speed=1.0)

sf.write(
    "./test.mp3",
    audio.samples,
    samplerate=audio.sample_rate,
)

You can save it as test_zh_en.py and then run:

pip install sherpa-onnx soundfile

python3 ./test_zh_en.py

You will get a file test.mp3 in the end.
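
The `rule_fsts` argument in the Python example above is a single comma-separated string of file paths. As a small sketch (the helper name `join_rule_fsts` is hypothetical and not part of sherpa-onnx), the string can be assembled from a list so that the model directory is written only once:

```python
def join_rule_fsts(model_dir, fst_names):
    """Join FST file names into the comma-separated form used for rule_fsts."""
    return ",".join(f"{model_dir}/{name}" for name in fst_names)


# Same files, in the same order, as in the example above.
rule_fsts = join_rule_fsts(
    "matcha-icefall-zh-en",
    ["phone-zh.fst", "date-zh.fst", "number-zh.fst"],
)
print(rule_fsts)
```

The printed value is identical to the literal string passed as `rule_fsts` in the example, so it can be used in its place.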

C API

You can use the following code to try matcha-icefall-zh-en with the C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.matcha.acoustic_model = "matcha-icefall-zh-en/model-steps-3.onnx";
  config.model.matcha.vocoder = "vocos-16khz-univ.onnx";
  config.model.matcha.lexicon = "matcha-icefall-zh-en/lexicon.txt";
  config.model.matcha.tokens = "matcha-icefall-zh-en/tokens.txt";
  config.model.matcha.data_dir = "matcha-icefall-zh-en/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;
  config.rule_fsts = "matcha-icefall-zh-en/phone-zh.fst,matcha-icefall-zh-en/date-zh.fst,matcha-icefall-zh-en/number-zh.fst";

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "我最近在学习machine learning,希望能够在未来的artificial intelligence领域有所建树。在这次vocation中,我们计划去Paris欣赏埃菲尔铁塔和卢浮宫的美景。某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。开始数字测试。2025年12月4号,拨打110或者189202512043。123456块钱。在这个快速发展的时代,人工智能技术正在改变我们的生活方式。语音合成作为人工智能的重要应用之一,让机器能够用自然流畅的语音与人类进行交流。";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
 -DSHERPA_ONNX_ENABLE_C_API=ON \
 -DCMAKE_BUILD_TYPE=Release \
 -DBUILD_SHARED_LIBS=ON \
 -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
 ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-zh-en.c. Then you can compile it with the following command:

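# The source file is listed before the -l flags because some linkers
# resolve libraries in command-line order.
gcc \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-zh-en \
  /tmp/test-zh-en.c \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-c-api \
  -lonnxruntime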

Now you can run

cd /tmp

# Assume you have downloaded the acoustic model and the vocoder model and put them in /tmp
./test-zh-en

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-zh-en.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

C++ API

You can use the following code to try matcha-icefall-zh-en with the C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.matcha.acoustic_model = "matcha-icefall-zh-en/model-steps-3.onnx";
  config.model.matcha.vocoder = "vocos-16khz-univ.onnx";
  config.model.matcha.lexicon = "matcha-icefall-zh-en/lexicon.txt";
  config.model.matcha.tokens = "matcha-icefall-zh-en/tokens.txt";
  config.model.matcha.data_dir = "matcha-icefall-zh-en/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;
  config.rule_fsts = "matcha-icefall-zh-en/phone-zh.fst,matcha-icefall-zh-en/date-zh.fst,matcha-icefall-zh-en/number-zh.fst";

  std::string filename = "./test.wav";
  std::string text = "我最近在学习machine learning,希望能够在未来的artificial intelligence领域有所建树。在这次vocation中,我们计划去Paris欣赏埃菲尔铁塔和卢浮宫的美景。某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。开始数字测试。2025年12月4号,拨打110或者189202512043。123456块钱。在这个快速发展的时代,人工智能技术正在改变我们的生活方式。语音合成作为人工智能的重要应用之一,让机器能够用自然流畅的语音与人类进行交流。";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
 -DSHERPA_ONNX_ENABLE_C_API=ON \
 -DCMAKE_BUILD_TYPE=Release \
 -DBUILD_SHARED_LIBS=ON \
 -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
 ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-zh-en.cc. Then you can compile it with the following command:

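# The source file is listed before the -l flags because some linkers
# resolve libraries in command-line order.
g++ \
  -std=c++17 \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-zh-en \
  /tmp/test-zh-en.cc \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-cxx-api \
  -lsherpa-onnx-c-api \
  -lonnxruntime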

Now you can run

cd /tmp

# Assume you have downloaded the acoustic model and the vocoder model and put them in /tmp
./test-zh-en

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-zh-en.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to try matcha-icefall-zh-en with the Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsMatchaModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            matcha: OfflineTtsMatchaModelConfig {
                acoustic_model: Some("matcha-icefall-zh-en/model-steps-3.onnx".into()),
                vocoder: Some("vocos-16khz-univ.onnx".into()),
                tokens: Some("matcha-icefall-zh-en/tokens.txt".into()),
                data_dir: Some("matcha-icefall-zh-en/espeak-ng-data".into()),
                lexicon: Some("matcha-icefall-zh-en/lexicon.txt".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        rule_fsts: Some("matcha-icefall-zh-en/phone-zh.fst,matcha-icefall-zh-en/date-zh.fst,matcha-icefall-zh-en/number-zh.fst".into()),
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "我最近在学习machine learning,希望能够在未来的artificial intelligence领域有所建树。在这次vocation中,我们计划去Paris欣赏埃菲尔铁塔和卢浮宫的美景。某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。开始数字测试。2025年12月4号,拨打110或者189202512043。123456块钱。在这个快速发展的时代,人工智能技术正在改变我们的生活方式。语音合成作为人工智能的重要应用之一,让机器能够用自然流畅的语音与人类进行交流。";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with matcha-icefall-zh-en with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      matcha: {
        acousticModel: 'matcha-icefall-zh-en/model-steps-3.onnx',
        vocoder: 'vocos-16khz-univ.onnx',
        tokens: 'matcha-icefall-zh-en/tokens.txt',
        dataDir: 'matcha-icefall-zh-en/espeak-ng-data',
        lexicon: 'matcha-icefall-zh-en/lexicon.txt',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
    ruleFsts: 'matcha-icefall-zh-en/phone-zh.fst,matcha-icefall-zh-en/date-zh.fst,matcha-icefall-zh-en/number-zh.fst',
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = '我最近在学习machine learning,希望能够在未来的artificial intelligence领域有所建树。在这次vocation中,我们计划去Paris欣赏埃菲尔铁塔和卢浮宫的美景。某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。开始数字测试。2025年12月4号,拨打110或者189202512043。123456块钱。在这个快速发展的时代,人工智能技术正在改变我们的生活方式。语音合成作为人工智能的重要应用之一,让机器能够用自然流畅的语音与人类进行交流。';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.
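
The script above reports a real-time factor (RTF): the time spent generating divided by the duration of the generated audio, so values below 1 mean synthesis runs faster than playback. The same arithmetic is sketched below in Python; the function name `real_time_factor` and the numbers are made up for illustration.

```python
def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """Return elapsed time divided by the duration of the generated audio."""
    duration = num_samples / sample_rate  # audio length in seconds
    return elapsed_seconds / duration


# Made-up numbers: 160000 samples at 16000 Hz is 10 s of audio,
# generated in 2 s of wall-clock time.
rtf = real_time_factor(elapsed_seconds=2.0, num_samples=160000, sample_rate=16000)
print(f"RTF = {rtf:.3f}")  # prints "RTF = 0.200"
```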

Dart API

You can use the following code to try matcha-icefall-zh-en with the Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final matcha = sherpa_onnx.OfflineTtsMatchaModelConfig(
    acousticModel: 'matcha-icefall-zh-en/model-steps-3.onnx',
    vocoder: 'vocos-16khz-univ.onnx',
    tokens: 'matcha-icefall-zh-en/tokens.txt',
    dataDir: 'matcha-icefall-zh-en/espeak-ng-data',
    lexicon: 'matcha-icefall-zh-en/lexicon.txt',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    matcha: matcha,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: '我最近在学习machine learning,希望能够在未来的artificial intelligence领域有所建树。在这次vocation中,我们计划去Paris欣赏埃菲尔铁塔和卢浮宫的美景。某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。开始数字测试。2025年12月4号,拨打110或者189202512043。123456块钱。在这个快速发展的时代,人工智能技术正在改变我们的生活方式。语音合成作为人工智能的重要应用之一,让机器能够用自然流畅的语音与人类进行交流。', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to try matcha-icefall-zh-en with the Swift API.

func run() {
  let matcha = sherpaOnnxOfflineTtsMatchaModelConfig(
    acousticModel: "matcha-icefall-zh-en/model-steps-3.onnx",
    vocoder: "vocos-16khz-univ.onnx",
    tokens: "matcha-icefall-zh-en/tokens.txt",
    dataDir: "matcha-icefall-zh-en/espeak-ng-data",
    lexicon: "matcha-icefall-zh-en/lexicon.txt"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(matcha: matcha)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "我最近在学习machine learning,希望能够在未来的artificial intelligence领域有所建树。在这次vocation中,我们计划去Paris欣赏埃菲尔铁塔和卢浮宫的美景。某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。开始数字测试。2025年12月4号,拨打110或者189202512043。123456块钱。在这个快速发展的时代,人工智能技术正在改变我们的生活方式。语音合成作为人工智能的重要应用之一,让机器能够用自然流畅的语音与人类进行交流。"
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to try matcha-icefall-zh-en with the C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Matcha.AcousticModel = "matcha-icefall-zh-en/model-steps-3.onnx";
config.Model.Matcha.Vocoder = "vocos-16khz-univ.onnx";
config.Model.Matcha.Tokens = "matcha-icefall-zh-en/tokens.txt";
config.Model.Matcha.DataDir = "matcha-icefall-zh-en/espeak-ng-data";
config.Model.Matcha.Lexicon = "matcha-icefall-zh-en/lexicon.txt";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.RuleFsts = "matcha-icefall-zh-en/phone-zh.fst,matcha-icefall-zh-en/date-zh.fst,matcha-icefall-zh-en/number-zh.fst";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "我最近在学习machine learning,希望能够在未来的artificial intelligence领域有所建树。在这次vocation中,我们计划去Paris欣赏埃菲尔铁塔和卢浮宫的美景。某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。开始数字测试。2025年12月4号,拨打110或者189202512043。123456块钱。在这个快速发展的时代,人工智能技术正在改变我们的生活方式。语音合成作为人工智能的重要应用之一,让机器能够用自然流畅的语音与人类进行交流。";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to try matcha-icefall-zh-en with the Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      matcha = OfflineTtsMatchaModelConfig(
        acousticModel = "matcha-icefall-zh-en/model-steps-3.onnx",
        vocoder = "vocos-16khz-univ.onnx",
        tokens = "matcha-icefall-zh-en/tokens.txt",
        dataDir = "matcha-icefall-zh-en/espeak-ng-data",
        lexicon = "matcha-icefall-zh-en/lexicon.txt",
      ),
      numThreads = 1,
      debug = true,
    ),
    ruleFsts = "matcha-icefall-zh-en/phone-zh.fst,matcha-icefall-zh-en/date-zh.fst,matcha-icefall-zh-en/number-zh.fst",
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "我最近在学习machine learning,希望能够在未来的artificial intelligence领域有所建树。在这次vocation中,我们计划去Paris欣赏埃菲尔铁塔和卢浮宫的美景。某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。开始数字测试。2025年12月4号,拨打110或者189202512043。123456块钱。在这个快速发展的时代,人工智能技术正在改变我们的生活方式。语音合成作为人工智能的重要应用之一,让机器能够用自然流畅的语音与人类进行交流。",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to try matcha-icefall-zh-en with the Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var matcha = new OfflineTtsMatchaModelConfig();
    matcha.setAcousticModel("matcha-icefall-zh-en/model-steps-3.onnx");
    matcha.setVocoder("vocos-16khz-univ.onnx");
    matcha.setTokens("matcha-icefall-zh-en/tokens.txt");
    matcha.setDataDir("matcha-icefall-zh-en/espeak-ng-data");
    matcha.setLexicon("matcha-icefall-zh-en/lexicon.txt");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setMatcha(matcha);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setRuleFsts("matcha-icefall-zh-en/phone-zh.fst,matcha-icefall-zh-en/date-zh.fst,matcha-icefall-zh-en/number-zh.fst");
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "我最近在学习machine learning,希望能够在未来的artificial intelligence领域有所建树。在这次vocation中,我们计划去Paris欣赏埃菲尔铁塔和卢浮宫的美景。某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。开始数字测试。2025年12月4号,拨打110或者189202512043。123456块钱。在这个快速发展的时代,人工智能技术正在改变我们的生活方式。语音合成作为人工智能的重要应用之一,让机器能够用自然流畅的语音与人类进行交流。";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to try matcha-icefall-zh-en with the Pascal API.

program test_matcha;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Matcha.AcousticModel := 'matcha-icefall-zh-en/model-steps-3.onnx';
  Config.Model.Matcha.Vocoder := 'vocos-16khz-univ.onnx';
  Config.Model.Matcha.Tokens := 'matcha-icefall-zh-en/tokens.txt';
  Config.Model.Matcha.DataDir := 'matcha-icefall-zh-en/espeak-ng-data';
  Config.Model.Matcha.Lexicon := 'matcha-icefall-zh-en/lexicon.txt';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.RuleFsts := 'matcha-icefall-zh-en/phone-zh.fst,matcha-icefall-zh-en/date-zh.fst,matcha-icefall-zh-en/number-zh.fst';
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('我最近在学习machine learning,希望能够在未来的artificial intelligence领域有所建树。在这次vocation中,我们计划去Paris欣赏埃菲尔铁塔和卢浮宫的美景。某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。开始数字测试。2025年12月4号,拨打110或者189202512043。123456块钱。在这个快速发展的时代,人工智能技术正在改变我们的生活方式。语音合成作为人工智能的重要应用之一,让机器能够用自然流畅的语音与人类进行交流。', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to try matcha-icefall-zh-en with the Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Matcha: sherpa.OfflineTtsMatchaModelConfig{
				AcousticModel: "matcha-icefall-zh-en/model-steps-3.onnx",
				Vocoder:       "vocos-16khz-univ.onnx",
				Tokens:        "matcha-icefall-zh-en/tokens.txt",
				DataDir:       "matcha-icefall-zh-en/espeak-ng-data",
				Lexicon:       "matcha-icefall-zh-en/lexicon.txt",
			},
			NumThreads: 1,
			Debug:      true,
		},
		RuleFsts: "matcha-icefall-zh-en/phone-zh.fst,matcha-icefall-zh-en/date-zh.fst,matcha-icefall-zh-en/number-zh.fst",
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "我最近在学习machine learning,希望能够在未来的artificial intelligence领域有所建树。在这次vocation中,我们计划去Paris欣赏埃菲尔铁塔和卢浮宫的美景。某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。开始数字测试。2025年12月4号,拨打110或者189202512043。123456块钱。在这个快速发展的时代,人工智能技术正在改变我们的生活方式。语音合成作为人工智能的重要应用之一,让机器能够用自然流畅的语音与人类进行交流。"

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

我最近在学习machine learning,希望能够在未来的artificial intelligence领域有所建树。在这次vocation中,我们计划去Paris欣赏埃菲尔铁塔和卢浮宫的美景。某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。开始数字测试。2025年12月4号,拨打110或者189202512043。123456块钱。在这个快速发展的时代,人工智能技术正在改变我们的生活方式。语音合成作为人工智能的重要应用之一,让机器能够用自然流畅的语音与人类进行交流。

Sample audio for each speaker is listed below:

Speaker 0

kokoro-multi-lang-v1_0

Info about this model

This model is kokoro v1.0 and it is from https://huggingface.co/hexgrad/Kokoro-82M

It supports both Chinese and English.

Number of speakers    Sample rate
53                    24000

Meaning of speaker prefix

Prefix  Meaning                       sid range    Number of speakers
af      American female               0 - 10       11
am      American male                 11 - 19      9
bf      British female                20 - 23      4
bm      British male                  24 - 27      4
ef      Spanish female                28           1
em      Spanish male                  29           1
ff      French female                 30           1
hf      Hindi female                  31 - 32      2
hm      Hindi male                    33 - 34      2
if      Italian female                35           1
im      Italian male                  36           1
jf      Japanese female               37 - 40      4
jm      Japanese male                 41           1
pf      Brazilian Portuguese female   42           1
pm      Brazilian Portuguese male     43 - 44      2
zf      Chinese female                45 - 48      4
zm      Chinese male                  49 - 52      4
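
The prefix table above can be turned into a small lookup. The following is only a sketch: `PREFIX_RANGES` and `prefix_for_sid` are hypothetical names, not part of sherpa-onnx, and the ranges are copied from the table.

```python
# (prefix, meaning, first sid, last sid), copied from the table above.
PREFIX_RANGES = [
    ("af", "American female", 0, 10),
    ("am", "American male", 11, 19),
    ("bf", "British female", 20, 23),
    ("bm", "British male", 24, 27),
    ("ef", "Spanish female", 28, 28),
    ("em", "Spanish male", 29, 29),
    ("ff", "French female", 30, 30),
    ("hf", "Hindi female", 31, 32),
    ("hm", "Hindi male", 33, 34),
    ("if", "Italian female", 35, 35),
    ("im", "Italian male", 36, 36),
    ("jf", "Japanese female", 37, 40),
    ("jm", "Japanese male", 41, 41),
    ("pf", "Brazilian Portuguese female", 42, 42),
    ("pm", "Brazilian Portuguese male", 43, 44),
    ("zf", "Chinese female", 45, 48),
    ("zm", "Chinese male", 49, 52),
]


def prefix_for_sid(sid):
    """Return (prefix, meaning) for a speaker ID, or raise ValueError."""
    for prefix, meaning, first, last in PREFIX_RANGES:
        if first <= sid <= last:
            return prefix, meaning
    raise ValueError(f"sid {sid} is outside the range 0 - 52")


# The per-prefix counts add up to the total number of speakers.
total_speakers = sum(last - first + 1 for _, _, first, last in PREFIX_RANGES)
print(total_speakers)        # prints 53
print(prefix_for_sid(45))    # prints ('zf', 'Chinese female')
```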

speaker ID to speaker name (sid -> name)

The mapping from speaker ID (sid) to speaker name is given below:

0 - 3     0 -> af_alloy         1 -> af_aoede         2 -> af_bella         3 -> af_heart
4 - 7     4 -> af_jessica       5 -> af_kore          6 -> af_nicole        7 -> af_nova
8 - 11    8 -> af_river         9 -> af_sarah         10 -> af_sky          11 -> am_adam
12 - 15   12 -> am_echo         13 -> am_eric         14 -> am_fenrir       15 -> am_liam
16 - 19   16 -> am_michael      17 -> am_onyx         18 -> am_puck         19 -> am_santa
20 - 23   20 -> bf_alice        21 -> bf_emma         22 -> bf_isabella     23 -> bf_lily
24 - 27   24 -> bm_daniel       25 -> bm_fable        26 -> bm_george       27 -> bm_lewis
28 - 31   28 -> ef_dora         29 -> em_alex         30 -> ff_siwis        31 -> hf_alpha
32 - 35   32 -> hf_beta         33 -> hm_omega        34 -> hm_psi          35 -> if_sara
36 - 39   36 -> im_nicola       37 -> jf_alpha        38 -> jf_gongitsune   39 -> jf_nezumi
40 - 43   40 -> jf_tebukuro     41 -> jm_kumo         42 -> pf_dora         43 -> pm_alex
44 - 47   44 -> pm_santa        45 -> zf_xiaobei      46 -> zf_xiaoni       47 -> zf_xiaoxiao
48 - 51   48 -> zf_xiaoyi       49 -> zm_yunjian      50 -> zm_yunxi        51 -> zm_yunxia
52        52 -> zm_yunyang

speaker name to speaker ID (name -> sid)

The mapping from speaker name to speaker ID (sid) is given below:

0 - 3     af_alloy -> 0         af_aoede -> 1         af_bella -> 2         af_heart -> 3
4 - 7     af_jessica -> 4       af_kore -> 5          af_nicole -> 6        af_nova -> 7
8 - 11    af_river -> 8         af_sarah -> 9         af_sky -> 10          am_adam -> 11
12 - 15   am_echo -> 12         am_eric -> 13         am_fenrir -> 14       am_liam -> 15
16 - 19   am_michael -> 16      am_onyx -> 17         am_puck -> 18         am_santa -> 19
20 - 23   bf_alice -> 20        bf_emma -> 21         bf_isabella -> 22     bf_lily -> 23
24 - 27   bm_daniel -> 24       bm_fable -> 25        bm_george -> 26       bm_lewis -> 27
28 - 31   ef_dora -> 28         em_alex -> 29         ff_siwis -> 30        hf_alpha -> 31
32 - 35   hf_beta -> 32         hm_omega -> 33        hm_psi -> 34          if_sara -> 35
36 - 39   im_nicola -> 36       jf_alpha -> 37        jf_gongitsune -> 38   jf_nezumi -> 39
40 - 43   jf_tebukuro -> 40     jm_kumo -> 41         pf_dora -> 42         pm_alex -> 43
44 - 47   pm_santa -> 44        zf_xiaobei -> 45      zf_xiaoni -> 46       zf_xiaoxiao -> 47
48 - 51   zf_xiaoyi -> 48       zm_yunjian -> 49      zm_yunxi -> 50        zm_yunxia -> 51
52 - 52   zm_yunyang -> 52
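
Since the APIs on this page take an integer `sid`, it can be convenient to look speakers up by name. The sketch below copies the names from the tables above in sid order; `SPEAKER_NAMES`, `NAME_TO_SID`, and `sid_for` are hypothetical names, not part of sherpa-onnx.

```python
# Speaker names in sid order (index == sid), copied from the tables above.
SPEAKER_NAMES = [
    "af_alloy", "af_aoede", "af_bella", "af_heart",
    "af_jessica", "af_kore", "af_nicole", "af_nova",
    "af_river", "af_sarah", "af_sky", "am_adam",
    "am_echo", "am_eric", "am_fenrir", "am_liam",
    "am_michael", "am_onyx", "am_puck", "am_santa",
    "bf_alice", "bf_emma", "bf_isabella", "bf_lily",
    "bm_daniel", "bm_fable", "bm_george", "bm_lewis",
    "ef_dora", "em_alex", "ff_siwis", "hf_alpha",
    "hf_beta", "hm_omega", "hm_psi", "if_sara",
    "im_nicola", "jf_alpha", "jf_gongitsune", "jf_nezumi",
    "jf_tebukuro", "jm_kumo", "pf_dora", "pm_alex",
    "pm_santa", "zf_xiaobei", "zf_xiaoni", "zf_xiaoxiao",
    "zf_xiaoyi", "zm_yunjian", "zm_yunxi", "zm_yunxia",
    "zm_yunyang",
]

# Reverse mapping: speaker name -> sid.
NAME_TO_SID = {name: sid for sid, name in enumerate(SPEAKER_NAMES)}


def sid_for(name):
    """Return the speaker ID for a speaker name, or raise KeyError."""
    return NAME_TO_SID[name]


print(sid_for("zf_xiaoxiao"))  # prints 47
print(SPEAKER_NAMES[11])       # prints am_adam
```

The returned integer is the value to pass as `sid` when generating audio with this model.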

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kokoro-multi-lang-v1_0.tar.bz2

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI            URL         China mirror
arm64-v8a      Download    Download
armeabi-v7a    Download    Download
x86_64         Download    Download
x86            Download    Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kokoro-multi-lang-v1_0.tar.bz2

You can use the following code to try kokoro-multi-lang-v1_0 with the Python API:

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        kokoro=sherpa_onnx.OfflineTtsKokoroModelConfig(
            model="kokoro-multi-lang-v1_0/model.onnx",
            voices="kokoro-multi-lang-v1_0/voices.bin",
            tokens="kokoro-multi-lang-v1_0/tokens.txt",
            data_dir="kokoro-multi-lang-v1_0/espeak-ng-data",
            lexicon="kokoro-multi-lang-v1_0/lexicon-us-en.txt,kokoro-multi-lang-v1_0/lexicon-zh.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="This model supports both Chinese and English. 小米的核心价值观是什么?答案是真诚热爱!有困难,请拨打110 或者18601200909。I am learning 机器学习. 我在研究 machine learning。What do you think 中英文说的如何呢?今天是 2025年6月18号.",
                     sid=0,
                     speed=1.0)

sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
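
The example above saves the result through soundfile, and whether MP3 output works depends on the libsndfile build that soundfile uses. If a plain WAV file is enough, the Python standard library can write it directly. The sketch below is not part of sherpa-onnx: `write_wav_16bit` is a hypothetical helper, and it assumes the samples are mono floats in the range [-1, 1], as `audio.samples` is assumed to be here.

```python
import math
import struct
import wave


def write_wav_16bit(filename, samples, sample_rate):
    """Write mono float samples (assumed to lie in [-1, 1]) as 16-bit PCM WAV."""
    frames = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))  # clip so the value fits in 16 bits
        frames += struct.pack("<h", int(round(s * 32767)))
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)   # mono
        f.setsampwidth(2)   # 2 bytes = 16 bits per sample
        f.setframerate(sample_rate)
        f.writeframes(bytes(frames))


if __name__ == "__main__":
    # Stand-in for audio.samples / audio.sample_rate: 0.5 s of a 440 Hz tone.
    sample_rate = 24000
    tone = [0.3 * math.sin(2 * math.pi * 440 * n / sample_rate)
            for n in range(sample_rate // 2)]
    write_wav_16bit("tone.wav", tone, sample_rate)
```

With the objects from the example above, the call would be `write_wav_16bit("test.wav", audio.samples, audio.sample_rate)`.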

C API

You can use the following code to try kokoro-multi-lang-v1_0 with the C API.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));
  config.model.kokoro.model = "kokoro-multi-lang-v1_0/model.onnx";
  config.model.kokoro.voices = "kokoro-multi-lang-v1_0/voices.bin";
  config.model.kokoro.tokens = "kokoro-multi-lang-v1_0/tokens.txt";
  config.model.kokoro.data_dir = "kokoro-multi-lang-v1_0/espeak-ng-data";
  config.model.kokoro.lexicon = "kokoro-multi-lang-v1_0/lexicon-us-en.txt,kokoro-multi-lang-v1_0/lexicon-zh.txt";

  config.model.num_threads = 1;

  // If you don't want to see debug messages, please set it to 0
  config.model.debug = 0;

  const char *text = "This model supports both Chinese and English. 小米的核心价值观是什么?答案是真诚热爱!有困难,请拨打110 或者18601200909。I am learning 机器学习. 我在研究 machine learning。What do you think 中英文说的如何呢?今天是 2025年6月18号.";

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

#if 0
  // If you don't want to use a callback, then please enable this branch
  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
#else
  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg,
                                             ProgressCallback, NULL);
#endif

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

To compile the example, first build sherpa-onnx as a shared library with the C API enabled:

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
 -DSHERPA_ONNX_ENABLE_C_API=ON \
 -DCMAKE_BUILD_TYPE=Release \
 -DBUILD_SHARED_LIBS=ON \
 -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
 ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-kokoro.c. Then you can compile it with the following command:

gcc \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-kokoro \
  /tmp/test-kokoro.c \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-kokoro

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-kokoro.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

C++ API

You can use the following code to play with kokoro-multi-lang-v1_0 with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;

  config.model.kokoro.model = "kokoro-multi-lang-v1_0/model.onnx";
  config.model.kokoro.voices = "kokoro-multi-lang-v1_0/voices.bin";
  config.model.kokoro.tokens = "kokoro-multi-lang-v1_0/tokens.txt";
  config.model.kokoro.data_dir = "kokoro-multi-lang-v1_0/espeak-ng-data";
  config.model.kokoro.lexicon = "kokoro-multi-lang-v1_0/lexicon-us-en.txt,kokoro-multi-lang-v1_0/lexicon-zh.txt";

  config.model.num_threads = 1;

  // If you don't want to see debug messages, please set it to 0
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "This model supports both Chinese and English. 小米的核心价值观是什么?答案是真诚热爱!有困难,请拨打110 或者18601200909。I am learning 机器学习. 我在研究 machine learning。What do you think 中英文说的如何呢?今天是 2025年6月18号.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

To compile the example, first build sherpa-onnx as a shared library with the C API enabled:

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
 -DSHERPA_ONNX_ENABLE_C_API=ON \
 -DCMAKE_BUILD_TYPE=Release \
 -DBUILD_SHARED_LIBS=ON \
 -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
 ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-kokoro.cc. Then you can compile it with the following command:

g++ \
  -std=c++17 \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-kokoro \
  /tmp/test-kokoro.cc \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-cxx-api \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-kokoro

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-kokoro.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with kokoro-multi-lang-v1_0 with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsKokoroModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            kokoro: OfflineTtsKokoroModelConfig {
                model: Some("kokoro-multi-lang-v1_0/model.onnx".into()),
                voices: Some("kokoro-multi-lang-v1_0/voices.bin".into()),
                tokens: Some("kokoro-multi-lang-v1_0/tokens.txt".into()),
                data_dir: Some("kokoro-multi-lang-v1_0/espeak-ng-data".into()),
                lexicon: Some("kokoro-multi-lang-v1_0/lexicon-us-en.txt,kokoro-multi-lang-v1_0/lexicon-zh.txt".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "This model supports both Chinese and English. 小米的核心价值观是什么?答案是真诚热爱!有困难,请拨打110 或者18601200909。I am learning 机器学习. 我在研究 machine learning。What do you think 中英文说的如何呢?今天是 2025年6月18号.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with kokoro-multi-lang-v1_0 with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      kokoro: {
        model: 'kokoro-multi-lang-v1_0/model.onnx',
        voices: 'kokoro-multi-lang-v1_0/voices.bin',
        tokens: 'kokoro-multi-lang-v1_0/tokens.txt',
        dataDir: 'kokoro-multi-lang-v1_0/espeak-ng-data',
        lexicon: 'kokoro-multi-lang-v1_0/lexicon-us-en.txt,kokoro-multi-lang-v1_0/lexicon-zh.txt',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'This model supports both Chinese and English. 小米的核心价值观是什么?答案是真诚热爱!有困难,请拨打110 或者18601200909。I am learning 机器学习. 我在研究 machine learning。What do you think 中英文说的如何呢?今天是 2025年6月18号.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to play with kokoro-multi-lang-v1_0 with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final kokoro = sherpa_onnx.OfflineTtsKokoroModelConfig(
    model: 'kokoro-multi-lang-v1_0/model.onnx',
    voices: 'kokoro-multi-lang-v1_0/voices.bin',
    tokens: 'kokoro-multi-lang-v1_0/tokens.txt',
    dataDir: 'kokoro-multi-lang-v1_0/espeak-ng-data',
    lexicon: 'kokoro-multi-lang-v1_0/lexicon-us-en.txt,kokoro-multi-lang-v1_0/lexicon-zh.txt',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    kokoro: kokoro,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'This model supports both Chinese and English. 小米的核心价值观是什么?答案是真诚热爱!有困难,请拨打110 或者18601200909。I am learning 机器学习. 我在研究 machine learning。What do you think 中英文说的如何呢?今天是 2025年6月18号.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with kokoro-multi-lang-v1_0 with Swift API.

func run() {
  let kokoro = sherpaOnnxOfflineTtsKokoroModelConfig(
    model: "kokoro-multi-lang-v1_0/model.onnx",
    voices: "kokoro-multi-lang-v1_0/voices.bin",
    tokens: "kokoro-multi-lang-v1_0/tokens.txt",
    dataDir: "kokoro-multi-lang-v1_0/espeak-ng-data",
    lexicon: "kokoro-multi-lang-v1_0/lexicon-us-en.txt,kokoro-multi-lang-v1_0/lexicon-zh.txt"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(kokoro: kokoro)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "This model supports both Chinese and English. 小米的核心价值观是什么?答案是真诚热爱!有困难,请拨打110 或者18601200909。I am learning 机器学习. 我在研究 machine learning。What do you think 中英文说的如何呢?今天是 2025年6月18号."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with kokoro-multi-lang-v1_0 with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Kokoro.Model = "kokoro-multi-lang-v1_0/model.onnx";
config.Model.Kokoro.Voices = "kokoro-multi-lang-v1_0/voices.bin";
config.Model.Kokoro.Tokens = "kokoro-multi-lang-v1_0/tokens.txt";
config.Model.Kokoro.DataDir = "kokoro-multi-lang-v1_0/espeak-ng-data";
config.Model.Kokoro.Lexicon = "kokoro-multi-lang-v1_0/lexicon-us-en.txt,kokoro-multi-lang-v1_0/lexicon-zh.txt";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "This model supports both Chinese and English. 小米的核心价值观是什么?答案是真诚热爱!有困难,请拨打110 或者18601200909。I am learning 机器学习. 我在研究 machine learning。What do you think 中英文说的如何呢?今天是 2025年6月18号.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with kokoro-multi-lang-v1_0 with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      kokoro = OfflineTtsKokoroModelConfig(
        model = "kokoro-multi-lang-v1_0/model.onnx",
        voices = "kokoro-multi-lang-v1_0/voices.bin",
        tokens = "kokoro-multi-lang-v1_0/tokens.txt",
        dataDir = "kokoro-multi-lang-v1_0/espeak-ng-data",
        lexicon = "kokoro-multi-lang-v1_0/lexicon-us-en.txt,kokoro-multi-lang-v1_0/lexicon-zh.txt",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "This model supports both Chinese and English. 小米的核心价值观是什么?答案是真诚热爱!有困难,请拨打110 或者18601200909。I am learning 机器学习. 我在研究 machine learning。What do you think 中英文说的如何呢?今天是 2025年6月18号.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with kokoro-multi-lang-v1_0 with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var kokoro = new OfflineTtsKokoroModelConfig();
    kokoro.setModel("kokoro-multi-lang-v1_0/model.onnx");
    kokoro.setVoices("kokoro-multi-lang-v1_0/voices.bin");
    kokoro.setTokens("kokoro-multi-lang-v1_0/tokens.txt");
    kokoro.setDataDir("kokoro-multi-lang-v1_0/espeak-ng-data");
    kokoro.setLexicon("kokoro-multi-lang-v1_0/lexicon-us-en.txt,kokoro-multi-lang-v1_0/lexicon-zh.txt");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setKokoro(kokoro);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "This model supports both Chinese and English. 小米的核心价值观是什么?答案是真诚热爱!有困难,请拨打110 或者18601200909。I am learning 机器学习. 我在研究 machine learning。What do you think 中英文说的如何呢?今天是 2025年6月18号.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with kokoro-multi-lang-v1_0 with Pascal API.

program test_kokoro;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Kokoro.Model := 'kokoro-multi-lang-v1_0/model.onnx';
  Config.Model.Kokoro.Voices := 'kokoro-multi-lang-v1_0/voices.bin';
  Config.Model.Kokoro.Tokens := 'kokoro-multi-lang-v1_0/tokens.txt';
  Config.Model.Kokoro.DataDir := 'kokoro-multi-lang-v1_0/espeak-ng-data';
  Config.Model.Kokoro.Lexicon := 'kokoro-multi-lang-v1_0/lexicon-us-en.txt,kokoro-multi-lang-v1_0/lexicon-zh.txt';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('This model supports both Chinese and English. 小米的核心价值观是什么?答案是真诚热爱!有困难,请拨打110 或者18601200909。I am learning 机器学习. 我在研究 machine learning。What do you think 中英文说的如何呢?今天是 2025年6月18号.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with kokoro-multi-lang-v1_0 with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Kokoro: sherpa.OfflineTtsKokoroModelConfig{
				Model:  "kokoro-multi-lang-v1_0/model.onnx",
				Voices: "kokoro-multi-lang-v1_0/voices.bin",
				Tokens: "kokoro-multi-lang-v1_0/tokens.txt",
				DataDir: "kokoro-multi-lang-v1_0/espeak-ng-data",
				Lexicon: "kokoro-multi-lang-v1_0/lexicon-us-en.txt,kokoro-multi-lang-v1_0/lexicon-zh.txt",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "This model supports both Chinese and English. 小米的核心价值观是什么?答案是真诚热爱!有困难,请拨打110 或者18601200909。I am learning 机器学习. 我在研究 machine learning。What do you think 中英文说的如何呢?今天是 2025年6月18号."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

This model supports both Chinese and English. 小米的核心价值观是什么?答案
是真诚热爱!有困难,请拨打110 或者18601200909。I am learning 机器学习.
我在研究 machine learning。What do you think 中英文说的如何呢?
今天是 2025年6月18号.

sample audios for different speakers are listed below:

Speaker 0 - af_alloy

Speaker 1 - af_aoede

Speaker 2 - af_bella

Speaker 3 - af_heart

Speaker 4 - af_jessica

Speaker 5 - af_kore

Speaker 6 - af_nicole

Speaker 7 - af_nova

Speaker 8 - af_river

Speaker 9 - af_sarah

Speaker 10 - af_sky

Speaker 11 - am_adam

Speaker 12 - am_echo

Speaker 13 - am_eric

Speaker 14 - am_fenrir

Speaker 15 - am_liam

Speaker 16 - am_michael

Speaker 17 - am_onyx

Speaker 18 - am_puck

Speaker 19 - am_santa

Speaker 20 - bf_alice

Speaker 21 - bf_emma

Speaker 22 - bf_isabella

Speaker 23 - bf_lily

Speaker 24 - bm_daniel

Speaker 25 - bm_fable

Speaker 26 - bm_george

Speaker 27 - bm_lewis

Speaker 28 - ef_dora

Speaker 29 - em_alex

Speaker 30 - ff_siwis

Speaker 31 - hf_alpha

Speaker 32 - hf_beta

Speaker 33 - hm_omega

Speaker 34 - hm_psi

Speaker 35 - if_sara

Speaker 36 - im_nicola

Speaker 37 - jf_alpha

Speaker 38 - jf_gongitsune

Speaker 39 - jf_nezumi

Speaker 40 - jf_tebukuro

Speaker 41 - jm_kumo

Speaker 42 - pf_dora

Speaker 43 - pm_alex

Speaker 44 - pm_santa

Speaker 45 - zf_xiaobei

Speaker 46 - zf_xiaoni

Speaker 47 - zf_xiaoxiao

Speaker 48 - zf_xiaoyi

Speaker 49 - zm_yunjian

Speaker 50 - zm_yunxi

Speaker 51 - zm_yunxia

Speaker 52 - zm_yunyang
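
When calling the APIs above you pass a numeric sid, so it can be handy to look up the sid for a speaker name. The following is a minimal sketch that copies the names from the list above in sid order; `SPEAKERS`, `NAME_TO_SID`, and `sids_with_prefix` are hypothetical names chosen for illustration and are not part of sherpa-onnx.

```python
# Speaker names for kokoro-multi-lang-v1_0 in sid order (sid 0 .. 52),
# copied from the list above.
SPEAKERS = """
af_alloy af_aoede af_bella af_heart af_jessica af_kore af_nicole af_nova
af_river af_sarah af_sky am_adam am_echo am_eric am_fenrir am_liam
am_michael am_onyx am_puck am_santa bf_alice bf_emma bf_isabella bf_lily
bm_daniel bm_fable bm_george bm_lewis ef_dora em_alex ff_siwis hf_alpha
hf_beta hm_omega hm_psi if_sara im_nicola jf_alpha jf_gongitsune jf_nezumi
jf_tebukuro jm_kumo pf_dora pm_alex pm_santa zf_xiaobei zf_xiaoni
zf_xiaoxiao zf_xiaoyi zm_yunjian zm_yunxi zm_yunxia zm_yunyang
""".split()

# Map each speaker name to its sid (its position in the list).
NAME_TO_SID = {name: sid for sid, name in enumerate(SPEAKERS)}


def sids_with_prefix(prefix):
    """Return all sids whose speaker name starts with `prefix` + "_"."""
    return [
        sid for sid, name in enumerate(SPEAKERS)
        if name.startswith(prefix + "_")
    ]
```

For example, `NAME_TO_SID["zf_xiaobei"]` gives the sid to pass when you want that voice, and `sids_with_prefix("zf")` lists the sids of all speakers with the zf prefix.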

kokoro-multi-lang-v1_1

Info about this model

This model is kokoro v1.1-zh and it is from https://huggingface.co/hexgrad/Kokoro-82M-v1.1-zh

It supports both Chinese and English.

| Number of speakers | Sample rate |
|---|---|
| 103 | 24000 |

Meaning of speaker prefix

| Prefix | Meaning | sid range | Number of speakers |
|---|---|---|---|
| af | American female | 0 - 1 | 2 |
| bf | British female | 2 | 1 |
| zf | Chinese female | 3 - 57 | 55 |
| zm | Chinese male | 58 - 102 | 45 |
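
The sid ranges in the prefix table above can be turned into a small lookup. This is only an illustrative sketch: `PREFIX_RANGES` and `describe_sid` are hypothetical names, and the ranges are copied from the table for kokoro-multi-lang-v1_1.

```python
# (prefix, meaning, first sid, last sid), copied from the prefix table above.
PREFIX_RANGES = [
    ("af", "American female", 0, 1),
    ("bf", "British female", 2, 2),
    ("zf", "Chinese female", 3, 57),
    ("zm", "Chinese male", 58, 102),
]


def describe_sid(sid):
    """Return (prefix, meaning) for a sid of kokoro-multi-lang-v1_1."""
    for prefix, meaning, first, last in PREFIX_RANGES:
        if first <= sid <= last:
            return prefix, meaning
    raise ValueError(f"sid {sid} is outside the range 0..102")


def num_speakers():
    """Total number of speakers implied by the ranges."""
    return sum(last - first + 1 for _, _, first, last in PREFIX_RANGES)
```

Summing the range sizes gives 2 + 1 + 55 + 45 = 103, which matches the number of speakers listed for this model.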

speaker ID to speaker name (sid -> name)

The mapping from speaker ID (sid) to speaker name is given below:

0 - 3: 0 -> af_maple, 1 -> af_sol, 2 -> bf_vale, 3 -> zf_001
4 - 7: 4 -> zf_002, 5 -> zf_003, 6 -> zf_004, 7 -> zf_005
8 - 11: 8 -> zf_006, 9 -> zf_007, 10 -> zf_008, 11 -> zf_017
12 - 15: 12 -> zf_018, 13 -> zf_019, 14 -> zf_021, 15 -> zf_022
16 - 19: 16 -> zf_023, 17 -> zf_024, 18 -> zf_026, 19 -> zf_027
20 - 23: 20 -> zf_028, 21 -> zf_032, 22 -> zf_036, 23 -> zf_038
24 - 27: 24 -> zf_039, 25 -> zf_040, 26 -> zf_042, 27 -> zf_043
28 - 31: 28 -> zf_044, 29 -> zf_046, 30 -> zf_047, 31 -> zf_048
32 - 35: 32 -> zf_049, 33 -> zf_051, 34 -> zf_059, 35 -> zf_060
36 - 39: 36 -> zf_067, 37 -> zf_070, 38 -> zf_071, 39 -> zf_072
40 - 43: 40 -> zf_073, 41 -> zf_074, 42 -> zf_075, 43 -> zf_076
44 - 47: 44 -> zf_077, 45 -> zf_078, 46 -> zf_079, 47 -> zf_083
48 - 51: 48 -> zf_084, 49 -> zf_085, 50 -> zf_086, 51 -> zf_087
52 - 55: 52 -> zf_088, 53 -> zf_090, 54 -> zf_092, 55 -> zf_093
56 - 59: 56 -> zf_094, 57 -> zf_099, 58 -> zm_009, 59 -> zm_010
60 - 63: 60 -> zm_011, 61 -> zm_012, 62 -> zm_013, 63 -> zm_014
64 - 67: 64 -> zm_015, 65 -> zm_016, 66 -> zm_020, 67 -> zm_025
68 - 71: 68 -> zm_029, 69 -> zm_030, 70 -> zm_031, 71 -> zm_033
72 - 75: 72 -> zm_034, 73 -> zm_035, 74 -> zm_037, 75 -> zm_041
76 - 79: 76 -> zm_045, 77 -> zm_050, 78 -> zm_052, 79 -> zm_053
80 - 83: 80 -> zm_054, 81 -> zm_055, 82 -> zm_056, 83 -> zm_057
84 - 87: 84 -> zm_058, 85 -> zm_061, 86 -> zm_062, 87 -> zm_063
88 - 91: 88 -> zm_064, 89 -> zm_065, 90 -> zm_066, 91 -> zm_068
92 - 95: 92 -> zm_069, 93 -> zm_080, 94 -> zm_081, 95 -> zm_082
96 - 99: 96 -> zm_089, 97 -> zm_091, 98 -> zm_095, 99 -> zm_096
100 - 102: 100 -> zm_097, 101 -> zm_098, 102 -> zm_100

speaker name to speaker ID (name -> sid)

The mapping from speaker name to speaker ID (sid) is given below:

0 - 3: af_maple -> 0, af_sol -> 1, bf_vale -> 2, zf_001 -> 3
4 - 7: zf_002 -> 4, zf_003 -> 5, zf_004 -> 6, zf_005 -> 7
8 - 11: zf_006 -> 8, zf_007 -> 9, zf_008 -> 10, zf_017 -> 11
12 - 15: zf_018 -> 12, zf_019 -> 13, zf_021 -> 14, zf_022 -> 15
16 - 19: zf_023 -> 16, zf_024 -> 17, zf_026 -> 18, zf_027 -> 19
20 - 23: zf_028 -> 20, zf_032 -> 21, zf_036 -> 22, zf_038 -> 23
24 - 27: zf_039 -> 24, zf_040 -> 25, zf_042 -> 26, zf_043 -> 27
28 - 31: zf_044 -> 28, zf_046 -> 29, zf_047 -> 30, zf_048 -> 31
32 - 35: zf_049 -> 32, zf_051 -> 33, zf_059 -> 34, zf_060 -> 35
36 - 39: zf_067 -> 36, zf_070 -> 37, zf_071 -> 38, zf_072 -> 39
40 - 43: zf_073 -> 40, zf_074 -> 41, zf_075 -> 42, zf_076 -> 43
44 - 47: zf_077 -> 44, zf_078 -> 45, zf_079 -> 46, zf_083 -> 47
48 - 51: zf_084 -> 48, zf_085 -> 49, zf_086 -> 50, zf_087 -> 51
52 - 55: zf_088 -> 52, zf_090 -> 53, zf_092 -> 54, zf_093 -> 55
56 - 59: zf_094 -> 56, zf_099 -> 57, zm_009 -> 58, zm_010 -> 59
60 - 63: zm_011 -> 60, zm_012 -> 61, zm_013 -> 62, zm_014 -> 63
64 - 67: zm_015 -> 64, zm_016 -> 65, zm_020 -> 66, zm_025 -> 67
68 - 71: zm_029 -> 68, zm_030 -> 69, zm_031 -> 70, zm_033 -> 71
72 - 75: zm_034 -> 72, zm_035 -> 73, zm_037 -> 74, zm_041 -> 75
76 - 79: zm_045 -> 76, zm_050 -> 77, zm_052 -> 78, zm_053 -> 79
80 - 83: zm_054 -> 80, zm_055 -> 81, zm_056 -> 82, zm_057 -> 83
84 - 87: zm_058 -> 84, zm_061 -> 85, zm_062 -> 86, zm_063 -> 87
88 - 91: zm_064 -> 88, zm_065 -> 89, zm_066 -> 90, zm_068 -> 91
92 - 95: zm_069 -> 92, zm_080 -> 93, zm_081 -> 94, zm_082 -> 95
96 - 99: zm_089 -> 96, zm_091 -> 97, zm_095 -> 98, zm_096 -> 99
100 - 102: zm_097 -> 100, zm_098 -> 101, zm_100 -> 102

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kokoro-multi-lang-v1_1.tar.bz2

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

| ABI | URL | China mirror |
|---|---|---|
| arm64-v8a | Download | Download |
| armeabi-v7a | Download | Download |
| x86_64 | Download | Download |
| x86 | Download | Download |

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kokoro-multi-lang-v1_1.tar.bz2

You can use the following code to play with kokoro-multi-lang-v1_1

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        kokoro=sherpa_onnx.OfflineTtsKokoroModelConfig(
            model="kokoro-multi-lang-v1_1/model.onnx",
            voices="kokoro-multi-lang-v1_1/voices.bin",
            tokens="kokoro-multi-lang-v1_1/tokens.txt",
            data_dir="kokoro-multi-lang-v1_1/espeak-ng-data",
            lexicon="kokoro-multi-lang-v1_1/lexicon-us-en.txt,kokoro-multi-lang-v1_1/lexicon-zh.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="This model supports both Chinese and English. 小米的核心价值观是什么?答案是真诚热爱!有困难,请拨打110 或者18601200909。I am learning 机器学习. 我在研究 machine learning。What do you think 中英文说的如何呢?今天是 2025年6月18号.",
                     sid=0,
                     speed=1.0)

sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
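
The Node.js example in this document reports a real-time factor (RTF), i.e. the elapsed generation time divided by the duration of the generated audio. The same arithmetic can be sketched in Python as below. `real_time_factor` is a hypothetical helper name and is not part of sherpa-onnx.

```python
def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """Return elapsed time divided by audio duration.

    A value below 1 means generation was faster than real time.
    """
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # audio duration in seconds
    return elapsed_seconds / duration
```

To use it with the example above, measure the time around `tts.generate(...)` (for instance with `time.time()`) and call `real_time_factor(elapsed, len(audio.samples), audio.sample_rate)`.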

C API

You can use the following code to play with kokoro-multi-lang-v1_1 with C API.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));
  config.model.kokoro.model = "kokoro-multi-lang-v1_1/model.onnx";
  config.model.kokoro.voices = "kokoro-multi-lang-v1_1/voices.bin";
  config.model.kokoro.tokens = "kokoro-multi-lang-v1_1/tokens.txt";
  config.model.kokoro.data_dir = "kokoro-multi-lang-v1_1/espeak-ng-data";
  config.model.kokoro.lexicon = "kokoro-multi-lang-v1_1/lexicon-us-en.txt,kokoro-multi-lang-v1_1/lexicon-zh.txt";

  config.model.num_threads = 1;

  // If you don't want to see debug messages, please set it to 0
  config.model.debug = 0;

  const char *text = "This model supports both Chinese and English. 小米的核心价值观是什么?答案是真诚热爱!有困难,请拨打110 或者18601200909。I am learning 机器学习. 我在研究 machine learning。What do you think 中英文说的如何呢?今天是 2025年6月18号.";

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

#if 0
  // If you don't want to use a callback, then please enable this branch
  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
#else
  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg,
                                             ProgressCallback, NULL);
#endif

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

To compile the example, first build sherpa-onnx as a shared library with the C API enabled:

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
 -DSHERPA_ONNX_ENABLE_C_API=ON \
 -DCMAKE_BUILD_TYPE=Release \
 -DBUILD_SHARED_LIBS=ON \
 -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
 ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-kokoro.c. Then you can compile it with the following command:

gcc \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-kokoro \
  /tmp/test-kokoro.c \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-kokoro

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-kokoro.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

C++ API

You can use the following code to play with kokoro-multi-lang-v1_1 with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;

  config.model.kokoro.model = "kokoro-multi-lang-v1_1/model.onnx";
  config.model.kokoro.voices = "kokoro-multi-lang-v1_1/voices.bin";
  config.model.kokoro.tokens = "kokoro-multi-lang-v1_1/tokens.txt";
  config.model.kokoro.data_dir = "kokoro-multi-lang-v1_1/espeak-ng-data";
  config.model.kokoro.lexicon = "kokoro-multi-lang-v1_1/lexicon-us-en.txt,kokoro-multi-lang-v1_1/lexicon-zh.txt";

  config.model.num_threads = 1;

  // If you don't want to see debug messages, please set it to 0
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "This model supports both Chinese and English. 小米的核心价值观是什么?答案是真诚热爱!有困难,请拨打110 或者18601200909。I am learning 机器学习. 我在研究 machine learning。What do you think 中英文说的如何呢?今天是 2025年6月18号.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

To compile the example, first build sherpa-onnx as a shared library with the C API enabled:

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
 -DSHERPA_ONNX_ENABLE_C_API=ON \
 -DCMAKE_BUILD_TYPE=Release \
 -DBUILD_SHARED_LIBS=ON \
 -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
 ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-kokoro.cc. Then you can compile it with the following command:

g++ \
  -std=c++17 \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-kokoro \
  /tmp/test-kokoro.cc \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-cxx-api \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-kokoro

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-kokoro.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with kokoro-multi-lang-v1_1 with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsKokoroModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            kokoro: OfflineTtsKokoroModelConfig {
                model: Some("kokoro-multi-lang-v1_1/model.onnx".into()),
                voices: Some("kokoro-multi-lang-v1_1/voices.bin".into()),
                tokens: Some("kokoro-multi-lang-v1_1/tokens.txt".into()),
                data_dir: Some("kokoro-multi-lang-v1_1/espeak-ng-data".into()),
                lexicon: Some("kokoro-multi-lang-v1_1/lexicon-us-en.txt,kokoro-multi-lang-v1_1/lexicon-zh.txt".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "This model supports both Chinese and English. 小米的核心价值观是什么?答案是真诚热爱!有困难,请拨打110 或者18601200909。I am learning 机器学习. 我在研究 machine learning。What do you think 中英文说的如何呢?今天是 2025年6月18号.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with kokoro-multi-lang-v1_1 with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      kokoro: {
        model: 'kokoro-multi-lang-v1_1/model.onnx',
        voices: 'kokoro-multi-lang-v1_1/voices.bin',
        tokens: 'kokoro-multi-lang-v1_1/tokens.txt',
        dataDir: 'kokoro-multi-lang-v1_1/espeak-ng-data',
        lexicon: 'kokoro-multi-lang-v1_1/lexicon-us-en.txt,kokoro-multi-lang-v1_1/lexicon-zh.txt',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'This model supports both Chinese and English. 小米的核心价值观是什么?答案是真诚热爱!有困难,请拨打110 或者18601200909。I am learning 机器学习. 我在研究 machine learning。What do you think 中英文说的如何呢?今天是 2025年6月18号.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.
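
The Node.js example above reports a real-time factor (RTF): the time spent generating divided by the duration of the generated audio, where the duration is the number of samples divided by the sample rate. The arithmetic can be sketched in Python as follows. The function name and the numbers are illustrative assumptions, not measurements.

```python
def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """Return elapsed time divided by the duration of the generated audio."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # seconds of audio
    return elapsed_seconds / duration


# Hypothetical numbers: 2.4 s of compute for 288000 samples at 24000 Hz (12 s of audio).
rtf = real_time_factor(2.4, 288000, 24000)
print(f"RTF = {rtf:.3f}")
```

An RTF below 1 means the audio is generated faster than it would take to play back.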

Dart API

The following code shows how to use kokoro-multi-lang-v1_1 with the Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final kokoro = sherpa_onnx.OfflineTtsKokoroModelConfig(
    model: 'kokoro-multi-lang-v1_1/model.onnx',
    voices: 'kokoro-multi-lang-v1_1/voices.bin',
    tokens: 'kokoro-multi-lang-v1_1/tokens.txt',
    dataDir: 'kokoro-multi-lang-v1_1/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    kokoro: kokoro,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'This model supports both Chinese and English. 小米的核心价值观是什么?答案是真诚热爱!有困难,请拨打110 或者18601200909。I am learning 机器学习. 我在研究 machine learning。What do you think 中英文说的如何呢?今天是 2025年6月18号.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

The following code shows how to use kokoro-multi-lang-v1_1 with the Swift API.

func run() {
  let kokoro = sherpaOnnxOfflineTtsKokoroModelConfig(
    model: "kokoro-multi-lang-v1_1/model.onnx",
    voices: "kokoro-multi-lang-v1_1/voices.bin",
    tokens: "kokoro-multi-lang-v1_1/tokens.txt",
    dataDir: "kokoro-multi-lang-v1_1/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(kokoro: kokoro)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "This model supports both Chinese and English. 小米的核心价值观是什么?答案是真诚热爱!有困难,请拨打110 或者18601200909。I am learning 机器学习. 我在研究 machine learning。What do you think 中英文说的如何呢?今天是 2025年6月18号."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

The following code shows how to use kokoro-multi-lang-v1_1 with the C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Kokoro.Model = "kokoro-multi-lang-v1_1/model.onnx";
config.Model.Kokoro.Voices = "kokoro-multi-lang-v1_1/voices.bin";
config.Model.Kokoro.Tokens = "kokoro-multi-lang-v1_1/tokens.txt";
config.Model.Kokoro.DataDir = "kokoro-multi-lang-v1_1/espeak-ng-data";
config.Model.Kokoro.Lexicon = "kokoro-multi-lang-v1_1/lexicon-us-en.txt,kokoro-multi-lang-v1_1/lexicon-zh.txt";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "This model supports both Chinese and English. 小米的核心价值观是什么?答案是真诚热爱!有困难,请拨打110 或者18601200909。I am learning 机器学习. 我在研究 machine learning。What do you think 中英文说的如何呢?今天是 2025年6月18号.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

The following code shows how to use kokoro-multi-lang-v1_1 with the Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      kokoro = OfflineTtsKokoroModelConfig(
        model = "kokoro-multi-lang-v1_1/model.onnx",
        voices = "kokoro-multi-lang-v1_1/voices.bin",
        tokens = "kokoro-multi-lang-v1_1/tokens.txt",
        dataDir = "kokoro-multi-lang-v1_1/espeak-ng-data",
        lexicon = "kokoro-multi-lang-v1_1/lexicon-us-en.txt,kokoro-multi-lang-v1_1/lexicon-zh.txt",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "This model supports both Chinese and English. 小米的核心价值观是什么?答案是真诚热爱!有困难,请拨打110 或者18601200909。I am learning 机器学习. 我在研究 machine learning。What do you think 中英文说的如何呢?今天是 2025年6月18号.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

The following code shows how to use kokoro-multi-lang-v1_1 with the Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var kokoro = new OfflineTtsKokoroModelConfig();
    kokoro.setModel("kokoro-multi-lang-v1_1/model.onnx");
    kokoro.setVoices("kokoro-multi-lang-v1_1/voices.bin");
    kokoro.setTokens("kokoro-multi-lang-v1_1/tokens.txt");
    kokoro.setDataDir("kokoro-multi-lang-v1_1/espeak-ng-data");
    kokoro.setLexicon("kokoro-multi-lang-v1_1/lexicon-us-en.txt,kokoro-multi-lang-v1_1/lexicon-zh.txt");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setKokoro(kokoro);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "This model supports both Chinese and English. 小米的核心价值观是什么?答案是真诚热爱!有困难,请拨打110 或者18601200909。I am learning 机器学习. 我在研究 machine learning。What do you think 中英文说的如何呢?今天是 2025年6月18号.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

The following code shows how to use kokoro-multi-lang-v1_1 with the Pascal API.

program test_kokoro;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Kokoro.Model := 'kokoro-multi-lang-v1_1/model.onnx';
  Config.Model.Kokoro.Voices := 'kokoro-multi-lang-v1_1/voices.bin';
  Config.Model.Kokoro.Tokens := 'kokoro-multi-lang-v1_1/tokens.txt';
  Config.Model.Kokoro.DataDir := 'kokoro-multi-lang-v1_1/espeak-ng-data';
  Config.Model.Kokoro.Lexicon := 'kokoro-multi-lang-v1_1/lexicon-us-en.txt,kokoro-multi-lang-v1_1/lexicon-zh.txt';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('This model supports both Chinese and English. 小米的核心价值观是什么?答案是真诚热爱!有困难,请拨打110 或者18601200909。I am learning 机器学习. 我在研究 machine learning。What do you think 中英文说的如何呢?今天是 2025年6月18号.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

The following code shows how to use kokoro-multi-lang-v1_1 with the Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Kokoro: sherpa.OfflineTtsKokoroModelConfig{
				Model:  "kokoro-multi-lang-v1_1/model.onnx",
				Voices: "kokoro-multi-lang-v1_1/voices.bin",
				Tokens: "kokoro-multi-lang-v1_1/tokens.txt",
				DataDir: "kokoro-multi-lang-v1_1/espeak-ng-data",
				Lexicon: "kokoro-multi-lang-v1_1/lexicon-us-en.txt,kokoro-multi-lang-v1_1/lexicon-zh.txt",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "This model supports both Chinese and English. 小米的核心价值观是什么?答案是真诚热爱!有困难,请拨打110 或者18601200909。I am learning 机器学习. 我在研究 machine learning。What do you think 中英文说的如何呢?今天是 2025年6月18号."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.
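
Most of the examples above pass the same set of paths: model.onnx, voices.bin, tokens.txt, the espeak-ng-data directory, and a comma-separated list of lexicon files. A wrong path is an easy mistake to make, so it can help to check the files before creating the TTS object. The following is a minimal sketch; `missing_model_files` is a hypothetical helper, not part of sherpa-onnx.

```python
from pathlib import Path


def missing_model_files(model_dir, lexicons):
    """Return the expected paths that do not exist on disk."""
    root = Path(model_dir)
    expected = [
        root / "model.onnx",
        root / "voices.bin",
        root / "tokens.txt",
        root / "espeak-ng-data",
    ]
    # The lexicon option is a single comma-separated string of paths.
    expected += [Path(p) for p in lexicons.split(",") if p]
    return [str(p) for p in expected if not p.exists()]


missing = missing_model_files(
    "kokoro-multi-lang-v1_1",
    "kokoro-multi-lang-v1_1/lexicon-us-en.txt,kokoro-multi-lang-v1_1/lexicon-zh.txt",
)
print("Missing:", missing)
```

An empty list means every expected file was found relative to the current working directory.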

Samples

For the following text:

This model supports both Chinese and English. 小米的核心价值观是什么?答案
是真诚热爱!有困难,请拨打110 或者18601200909。I am learning 机器学习.
我在研究 machine learning。What do you think 中英文说的如何呢?
今天是 2025年6月18号.

sample audio for each speaker is listed below:

Speaker 0 - af_maple

Speaker 1 - af_sol

Speaker 2 - bf_vale

Speaker 3 - zf_001

Speaker 4 - zf_002

Speaker 5 - zf_003

Speaker 6 - zf_004

Speaker 7 - zf_005

Speaker 8 - zf_006

Speaker 9 - zf_007

Speaker 10 - zf_008

Speaker 11 - zf_017

Speaker 12 - zf_018

Speaker 13 - zf_019

Speaker 14 - zf_021

Speaker 15 - zf_022

Speaker 16 - zf_023

Speaker 17 - zf_024

Speaker 18 - zf_026

Speaker 19 - zf_027

Speaker 20 - zf_028

Speaker 21 - zf_032

Speaker 22 - zf_036

Speaker 23 - zf_038

Speaker 24 - zf_039

Speaker 25 - zf_040

Speaker 26 - zf_042

Speaker 27 - zf_043

Speaker 28 - zf_044

Speaker 29 - zf_046

Speaker 30 - zf_047

Speaker 31 - zf_048

Speaker 32 - zf_049

Speaker 33 - zf_051

Speaker 34 - zf_059

Speaker 35 - zf_060

Speaker 36 - zf_067

Speaker 37 - zf_070

Speaker 38 - zf_071

Speaker 39 - zf_072

Speaker 40 - zf_073

Speaker 41 - zf_074

Speaker 42 - zf_075

Speaker 43 - zf_076

Speaker 44 - zf_077

Speaker 45 - zf_078

Speaker 46 - zf_079

Speaker 47 - zf_083

Speaker 48 - zf_084

Speaker 49 - zf_085

Speaker 50 - zf_086

Speaker 51 - zf_087

Speaker 52 - zf_088

Speaker 53 - zf_090

Speaker 54 - zf_092

Speaker 55 - zf_093

Speaker 56 - zf_094

Speaker 57 - zf_099

Speaker 58 - zm_009

Speaker 59 - zm_010

Speaker 60 - zm_011

Speaker 61 - zm_012

Speaker 62 - zm_013

Speaker 63 - zm_014

Speaker 64 - zm_015

Speaker 65 - zm_016

Speaker 66 - zm_020

Speaker 67 - zm_025

Speaker 68 - zm_029

Speaker 69 - zm_030

Speaker 70 - zm_031

Speaker 71 - zm_033

Speaker 72 - zm_034

Speaker 73 - zm_035

Speaker 74 - zm_037

Speaker 75 - zm_041

Speaker 76 - zm_045

Speaker 77 - zm_050

Speaker 78 - zm_052

Speaker 79 - zm_053

Speaker 80 - zm_054

Speaker 81 - zm_055

Speaker 82 - zm_056

Speaker 83 - zm_057

Speaker 84 - zm_058

Speaker 85 - zm_061

Speaker 86 - zm_062

Speaker 87 - zm_063

Speaker 88 - zm_064

Speaker 89 - zm_065

Speaker 90 - zm_066

Speaker 91 - zm_068

Speaker 92 - zm_069

Speaker 93 - zm_080

Speaker 94 - zm_081

Speaker 95 - zm_082

Speaker 96 - zm_089

Speaker 97 - zm_091

Speaker 98 - zm_095

Speaker 99 - zm_096

Speaker 100 - zm_097

Speaker 101 - zm_098

Speaker 102 - zm_100
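
The list above pairs each speaker ID with a voice name. Since the examples select a voice by its numeric sid, a lookup from name to ID can be convenient. The sketch below parses a few lines copied from the list; `parse_speaker_list` is a hypothetical helper, and only a subset of the 103 entries is included to keep it short.

```python
import re

# A few entries copied from the speaker list above.
SPEAKER_LINES = """
Speaker 0 - af_maple
Speaker 1 - af_sol
Speaker 2 - bf_vale
Speaker 3 - zf_001
Speaker 58 - zm_009
Speaker 102 - zm_100
"""


def parse_speaker_list(text):
    """Map each voice name to its speaker ID."""
    pattern = re.compile(r"^Speaker\s+(\d+)\s+-\s+(\S+)$")
    mapping = {}
    for line in text.splitlines():
        match = pattern.match(line.strip())
        if match:
            mapping[match.group(2)] = int(match.group(1))
    return mapping


name_to_sid = parse_speaker_list(SPEAKER_LINES)
print(name_to_sid["zm_009"])  # the value to pass as sid
```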

Arabic

This section lists text to speech models for Arabic.

vits-piper-ar_JO-SA_dii-high

Info about this model

This model is converted from https://huggingface.co/OpenVoiceOS/phoonnx_ar-SA_dii_espeak

Number of speakers | Sample rate
1 | 22050

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-ar_JO-SA_dii-high.tar.bz2
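
If you prefer to fetch and unpack the archive from a script, the following Python sketch uses only the standard library. The function names are illustrative assumptions, and nothing is downloaded until you call `download` yourself.

```python
import tarfile
import urllib.request
from pathlib import Path

MODEL_URL = (
    "https://github.com/k2-fsa/sherpa-onnx/releases/download/"
    "tts-models/vits-piper-ar_JO-SA_dii-high.tar.bz2"
)


def download(url, dest):
    """Download url to dest unless the file already exists."""
    dest = Path(dest)
    if not dest.exists():
        urllib.request.urlretrieve(url, str(dest))
    return dest


def extract(archive, target_dir):
    """Extract a .tar.bz2 archive and return the names of its members."""
    with tarfile.open(str(archive), "r:bz2") as tar:
        names = tar.getnames()
        tar.extractall(str(target_dir))
    return names


# Usage (requires network access):
#   archive = download(MODEL_URL, "vits-piper-ar_JO-SA_dii-high.tar.bz2")
#   extract(archive, ".")
```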

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI | URL | China mirror
arm64-v8a | Download | Download
armeabi-v7a | Download | Download
x86_64 | Download | Download
x86 | Download | Download

If you don’t know what ABI is, you probably need to select arm64-v8a.
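
If you can obtain the device's machine name (for example, the string reported by uname -m), it can be mapped to one of the ABI names in the table. The sketch below is an assumption-based lookup, not an official tool: the table of machine names is a common convention, and `pick_abi` is a hypothetical helper that falls back to arm64-v8a as suggested above.

```python
# Assumed mapping from machine names to the Android ABI names used in the table.
MACHINE_TO_ABI = {
    "aarch64": "arm64-v8a",
    "arm64": "arm64-v8a",
    "armv7l": "armeabi-v7a",
    "armv8l": "armeabi-v7a",  # 32-bit userspace on a 64-bit CPU
    "x86_64": "x86_64",
    "amd64": "x86_64",
    "i686": "x86",
    "i386": "x86",
}


def pick_abi(machine, default="arm64-v8a"):
    """Return the APK ABI for a machine name, or the default if unknown."""
    return MACHINE_TO_ABI.get(machine.strip().lower(), default)


print(pick_abi("aarch64"))
```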

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

The following code shows how to use vits-piper-ar_JO-SA_dii-high with the C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-ar_JO-SA_dii-high/ar_JO-SA_dii-high.onnx";
  config.model.vits.tokens = "vits-piper-ar_JO-SA_dii-high/tokens.txt";
  config.model.vits.data_dir = "vits-piper-ar_JO-SA_dii-high/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "كيف حالك اليوم؟";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
  -DSHERPA_ONNX_ENABLE_C_API=ON \
  -DCMAKE_BUILD_TYPE=Release \
  -DBUILD_SHARED_LIBS=ON \
  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
  ..

make
make install

You can find the required header files and libraries inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc \
  -I /tmp/sherpa-onnx/shared/include \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-c-api \
  -lonnxruntime \
  -o /tmp/test-piper \
  /tmp/test-piper.c

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-ar_JO-SA_dii-high.tar.bz2

The following code shows how to use vits-piper-ar_JO-SA_dii-high with the Python API.

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-ar_JO-SA_dii-high/ar_JO-SA_dii-high.onnx",
            data_dir="vits-piper-ar_JO-SA_dii-high/espeak-ng-data",
            tokens="vits-piper-ar_JO-SA_dii-high/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="كيف حالك اليوم؟",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
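
If the soundfile package is not available, the generated samples can also be saved with Python's standard wave module. The sketch below assumes the samples are mono floats in the range [-1, 1] and converts them to 16-bit PCM; `write_wave_pcm16` is a hypothetical helper, and the silent test signal stands in for real TTS output.

```python
import os
import struct
import tempfile
import wave


def write_wave_pcm16(filename, samples, sample_rate):
    """Write mono float samples in [-1, 1] as a 16-bit PCM WAV file."""
    frames = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))  # clip to avoid integer overflow
        frames += struct.pack("<h", int(round(s * 32767)))
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)  # 2 bytes per sample = 16 bits
        f.setframerate(sample_rate)
        f.writeframes(bytes(frames))


# Stand-in data: 0.1 s of silence at 22050 Hz.
path = os.path.join(tempfile.gettempdir(), "silence.wav")
write_wave_pcm16(path, [0.0] * 2205, 22050)
print("Saved to", path)
```

With real output you would pass audio.samples and audio.sample_rate instead of the stand-in values.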

C++ API

The following code shows how to use vits-piper-ar_JO-SA_dii-high with the C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-ar_JO-SA_dii-high/ar_JO-SA_dii-high.onnx";
  config.model.vits.tokens = "vits-piper-ar_JO-SA_dii-high/tokens.txt";
  config.model.vits.data_dir = "vits-piper-ar_JO-SA_dii-high/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "كيف حالك اليوم؟";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
  -DSHERPA_ONNX_ENABLE_C_API=ON \
  -DCMAKE_BUILD_TYPE=Release \
  -DBUILD_SHARED_LIBS=ON \
  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
  ..

make
make install

You can find the required header files and libraries inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++ \
  -std=c++17 \
  -I /tmp/sherpa-onnx/shared/include \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-cxx-api \
  -lsherpa-onnx-c-api \
  -lonnxruntime \
  -o /tmp/test-piper \
  /tmp/test-piper.cc

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html
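
The ProgressCallback in the C++ example above returns 1 to continue generating and 0 to stop. The following toy simulation in Python illustrates that contract only. `generate_with_callback` is a hypothetical stand-in, not sherpa-onnx code, and whether the real engine keeps the chunk on which the callback returned 0 is an assumption of this sketch.

```python
def generate_with_callback(chunks, callback):
    """Toy stand-in for a TTS engine that produces audio chunk by chunk."""
    collected = []
    total = len(chunks)
    for i, chunk in enumerate(chunks, start=1):
        collected.extend(chunk)
        progress = i / total
        # A return value of 0 asks the generator to stop early.
        if callback(chunk, progress) == 0:
            break
    return collected


def stop_after_half(samples, progress):
    """Report progress and stop once half of the chunks were produced."""
    print(f"Progress: {progress * 100:.1f}%")
    return 1 if progress < 0.5 else 0


# Four chunks of four samples each; generation stops after the second chunk.
audio = generate_with_callback([[0.0] * 4] * 4, stop_after_half)
print(len(audio))
```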

Rust API

The following code shows how to use vits-piper-ar_JO-SA_dii-high with the Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-ar_JO-SA_dii-high/ar_JO-SA_dii-high.onnx".into()),
                tokens: Some("vits-piper-ar_JO-SA_dii-high/tokens.txt".into()),
                data_dir: Some("vits-piper-ar_JO-SA_dii-high/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "كيف حالك اليوم؟";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-ar_JO-SA_dii-high with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-ar_JO-SA_dii-high/ar_JO-SA_dii-high.onnx',
        tokens: 'vits-piper-ar_JO-SA_dii-high/tokens.txt',
        dataDir: 'vits-piper-ar_JO-SA_dii-high/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'كيف حالك اليوم؟';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

The following code shows how to use vits-piper-ar_JO-SA_dii-high with the Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-ar_JO-SA_dii-high/ar_JO-SA_dii-high.onnx',
    tokens: 'vits-piper-ar_JO-SA_dii-high/tokens.txt',
    dataDir: 'vits-piper-ar_JO-SA_dii-high/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'كيف حالك اليوم؟', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

The following code shows how to use vits-piper-ar_JO-SA_dii-high with the Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-ar_JO-SA_dii-high/ar_JO-SA_dii-high.onnx",
    lexicon: "",
    tokens: "vits-piper-ar_JO-SA_dii-high/tokens.txt",
    dataDir: "vits-piper-ar_JO-SA_dii-high/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "كيف حالك اليوم؟"
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

The following code shows how to use vits-piper-ar_JO-SA_dii-high with the C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-ar_JO-SA_dii-high/ar_JO-SA_dii-high.onnx";
config.Model.Vits.Tokens = "vits-piper-ar_JO-SA_dii-high/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-ar_JO-SA_dii-high/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "كيف حالك اليوم؟";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

The following code shows how to use vits-piper-ar_JO-SA_dii-high with the Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-ar_JO-SA_dii-high/ar_JO-SA_dii-high.onnx",
        tokens = "vits-piper-ar_JO-SA_dii-high/tokens.txt",
        dataDir = "vits-piper-ar_JO-SA_dii-high/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "كيف حالك اليوم؟",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

The following code shows how to use vits-piper-ar_JO-SA_dii-high with the Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-ar_JO-SA_dii-high/ar_JO-SA_dii-high.onnx");
    vits.setTokens("vits-piper-ar_JO-SA_dii-high/tokens.txt");
    vits.setDataDir("vits-piper-ar_JO-SA_dii-high/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "كيف حالك اليوم؟";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

The following code shows how to use vits-piper-ar_JO-SA_dii-high with the Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-ar_JO-SA_dii-high/ar_JO-SA_dii-high.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-ar_JO-SA_dii-high/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-ar_JO-SA_dii-high/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('كيف حالك اليوم؟', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-ar_JO-SA_dii-high with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-ar_JO-SA_dii-high/ar_JO-SA_dii-high.onnx",
				Tokens:  "vits-piper-ar_JO-SA_dii-high/tokens.txt",
				DataDir: "vits-piper-ar_JO-SA_dii-high/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "كيف حالك اليوم؟"

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

كيف حالك اليوم؟

sample audios for different speakers are listed below:

Speaker 0

vits-piper-ar_JO-SA_miro-high

Info about this model

This model is converted from https://huggingface.co/OpenVoiceOS/phoonnx_ar-SA_miro_espeak

Number of speakers | Sample rate
1 | 22050

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-ar_JO-SA_miro-high.tar.bz2
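The archive can also be fetched and unpacked from a script. The following is a minimal sketch that uses only the Python standard library. The helper names `model_url` and `download_and_extract` are hypothetical, and the sketch assumes that the archive unpacks into a directory named after the model, as the file paths used in the API examples suggest.

```python
import tarfile
import urllib.request
from pathlib import Path

# Release URL prefix taken from the model download address.
BASE_URL = "https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models"


def model_url(name: str) -> str:
    """Build the download URL of a model archive (hypothetical helper)."""
    return f"{BASE_URL}/{name}.tar.bz2"


def download_and_extract(name: str, dest: str = ".") -> Path:
    """Download <name>.tar.bz2 into dest, unpack it, and delete the archive."""
    dest_dir = Path(dest)
    dest_dir.mkdir(parents=True, exist_ok=True)
    archive = dest_dir / f"{name}.tar.bz2"
    urllib.request.urlretrieve(model_url(name), str(archive))
    # Only unpack archives you trust: extractall() writes whatever paths
    # the archive contains.
    with tarfile.open(archive, "r:bz2") as tar:
        tar.extractall(dest_dir)
    archive.unlink()
    # Assumption: the archive has a top-level directory named after the model.
    return dest_dir / name
```

Calling `download_and_extract("vits-piper-ar_JO-SA_miro-high")` from the directory where you run the examples should leave the model files where the code on this page expects them, provided the assumption about the archive layout holds.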

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI | URL | China mirror
arm64-v8a | Download | Download
armeabi-v7a | Download | Download
x86_64 | Download | Download
x86 | Download | Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-ar_JO-SA_miro-high with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-ar_JO-SA_miro-high/ar_JO-SA_miro-high.onnx";
  config.model.vits.tokens = "vits-piper-ar_JO-SA_miro-high/tokens.txt";
  config.model.vits.data_dir = "vits-piper-ar_JO-SA_miro-high/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "كيف حالك اليوم؟";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc   -I /tmp/sherpa-onnx/shared/include   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-c-api   -lonnxruntime   -o /tmp/test-piper   /tmp/test-piper.c

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-ar_JO-SA_miro-high.tar.bz2

You can use the following code to play with vits-piper-ar_JO-SA_miro-high with Python API.

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-ar_JO-SA_miro-high/ar_JO-SA_miro-high.onnx",
            data_dir="vits-piper-ar_JO-SA_miro-high/espeak-ng-data",
            tokens="vits-piper-ar_JO-SA_miro-high/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="كيف حالك اليوم؟",
                     sid=0,
                     speed=1.0)

# Save as WAV, like the other examples on this page
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
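To confirm that a generated file matches the sample rate listed for this model (22050), you can read its header back. The following is a minimal sketch using the standard-library `wave` module. The helper name `wav_info` is hypothetical, and the sketch assumes the output was saved as a PCM WAV file, which is the only kind `wave` can read.

```python
import wave


def wav_info(path: str) -> dict:
    """Return basic header fields of a PCM WAV file (hypothetical helper)."""
    with wave.open(path, "rb") as f:
        sample_rate = f.getframerate()
        num_frames = f.getnframes()
        return {
            "sample_rate": sample_rate,
            "channels": f.getnchannels(),
            # Duration follows from the frame count and the sample rate.
            "duration_seconds": num_frames / sample_rate,
        }
```

For example, `wav_info("test.wav")["sample_rate"]` should equal 22050 for a file generated with this model.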

C++ API

You can use the following code to play with vits-piper-ar_JO-SA_miro-high with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-ar_JO-SA_miro-high/ar_JO-SA_miro-high.onnx";
  config.model.vits.tokens = "vits-piper-ar_JO-SA_miro-high/tokens.txt";
  config.model.vits.data_dir = "vits-piper-ar_JO-SA_miro-high/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "كيف حالك اليوم؟";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++   -std=c++17   -I /tmp/sherpa-onnx/shared/include   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-cxx-api   -lsherpa-onnx-c-api   -lonnxruntime   -o /tmp/test-piper   /tmp/test-piper.cc

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with vits-piper-ar_JO-SA_miro-high with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-ar_JO-SA_miro-high/ar_JO-SA_miro-high.onnx".into()),
                tokens: Some("vits-piper-ar_JO-SA_miro-high/tokens.txt".into()),
                data_dir: Some("vits-piper-ar_JO-SA_miro-high/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "كيف حالك اليوم؟";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-ar_JO-SA_miro-high with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-ar_JO-SA_miro-high/ar_JO-SA_miro-high.onnx',
        tokens: 'vits-piper-ar_JO-SA_miro-high/tokens.txt',
        dataDir: 'vits-piper-ar_JO-SA_miro-high/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'كيف حالك اليوم؟';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.
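The real-time factor (RTF) printed by the example above is the elapsed generation time divided by the duration of the generated audio, so a value below 1 means generation ran faster than playback. The same arithmetic as a minimal Python sketch, with a hypothetical helper name:

```python
def real_time_factor(elapsed_seconds: float, num_samples: int,
                     sample_rate: int) -> float:
    """RTF = processing time / duration of the generated audio."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # seconds of audio
    return elapsed_seconds / duration


# 44100 samples at 22050 Hz are 2 seconds of audio; 0.5 s of compute
# gives an RTF of 0.25.
print(real_time_factor(0.5, 44100, 22050))
```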

Dart API

You can use the following code to play with vits-piper-ar_JO-SA_miro-high with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-ar_JO-SA_miro-high/ar_JO-SA_miro-high.onnx',
    tokens: 'vits-piper-ar_JO-SA_miro-high/tokens.txt',
    dataDir: 'vits-piper-ar_JO-SA_miro-high/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'كيف حالك اليوم؟', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-ar_JO-SA_miro-high with Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-ar_JO-SA_miro-high/ar_JO-SA_miro-high.onnx",
    lexicon: "",
    tokens: "vits-piper-ar_JO-SA_miro-high/tokens.txt",
    dataDir: "vits-piper-ar_JO-SA_miro-high/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "كيف حالك اليوم؟"
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-ar_JO-SA_miro-high with C# API.

using System;
using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-ar_JO-SA_miro-high/ar_JO-SA_miro-high.onnx";
config.Model.Vits.Tokens = "vits-piper-ar_JO-SA_miro-high/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-ar_JO-SA_miro-high/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "كيف حالك اليوم؟";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-ar_JO-SA_miro-high with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-ar_JO-SA_miro-high/ar_JO-SA_miro-high.onnx",
        tokens = "vits-piper-ar_JO-SA_miro-high/tokens.txt",
        dataDir = "vits-piper-ar_JO-SA_miro-high/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "كيف حالك اليوم؟",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with vits-piper-ar_JO-SA_miro-high with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-ar_JO-SA_miro-high/ar_JO-SA_miro-high.onnx");
    vits.setTokens("vits-piper-ar_JO-SA_miro-high/tokens.txt");
    vits.setDataDir("vits-piper-ar_JO-SA_miro-high/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "كيف حالك اليوم؟";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-ar_JO-SA_miro-high with Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-ar_JO-SA_miro-high/ar_JO-SA_miro-high.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-ar_JO-SA_miro-high/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-ar_JO-SA_miro-high/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('كيف حالك اليوم؟', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-ar_JO-SA_miro-high with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-ar_JO-SA_miro-high/ar_JO-SA_miro-high.onnx",
				Tokens:  "vits-piper-ar_JO-SA_miro-high/tokens.txt",
				DataDir: "vits-piper-ar_JO-SA_miro-high/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "كيف حالك اليوم؟"

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

كيف حالك اليوم؟

sample audios for different speakers are listed below:

Speaker 0

vits-piper-ar_JO-SA_miro_V2-high

Info about this model

This model is converted from https://huggingface.co/OpenVoiceOS/phoonnx_ar-SA_miro_espeak_V2

Number of speakers | Sample rate
1 | 22050

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-ar_JO-SA_miro_V2-high.tar.bz2

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI | URL | China mirror
arm64-v8a | Download | Download
armeabi-v7a | Download | Download
x86_64 | Download | Download
x86 | Download | Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-ar_JO-SA_miro_V2-high with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-ar_JO-SA_miro_V2-high/ar_JO-SA_miro_V2-high.onnx";
  config.model.vits.tokens = "vits-piper-ar_JO-SA_miro_V2-high/tokens.txt";
  config.model.vits.data_dir = "vits-piper-ar_JO-SA_miro_V2-high/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "كيف حالك اليوم؟";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc   -I /tmp/sherpa-onnx/shared/include   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-c-api   -lonnxruntime   -o /tmp/test-piper   /tmp/test-piper.c

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-ar_JO-SA_miro_V2-high.tar.bz2

You can use the following code to play with vits-piper-ar_JO-SA_miro_V2-high with Python API.

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-ar_JO-SA_miro_V2-high/ar_JO-SA_miro_V2-high.onnx",
            data_dir="vits-piper-ar_JO-SA_miro_V2-high/espeak-ng-data",
            tokens="vits-piper-ar_JO-SA_miro_V2-high/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="كيف حالك اليوم؟",
                     sid=0,
                     speed=1.0)

# Save as WAV, like the other examples on this page
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
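Several of the other language examples pass a callback that returns 1 to continue generating and 0 to stop. The following is a minimal Python sketch of such a callback, runnable on its own. The factory name `make_collecting_callback` is hypothetical. The `(samples, progress)` signature is modeled on the C++ `ProgressCallback`, and passing the callback to `tts.generate(..., callback=...)` is an assumption to verify against the Python API documentation.

```python
from typing import List, Sequence


def make_collecting_callback(max_samples: int):
    """Return (callback, chunks).

    The callback stores every chunk it receives and returns 0 once at
    least max_samples samples have arrived, otherwise 1.
    """
    chunks: List[List[float]] = []

    def callback(samples: Sequence[float], progress: float) -> int:
        chunks.append(list(samples))
        received = sum(len(c) for c in chunks)
        # 1 means to continue, 0 means to stop
        return 1 if received < max_samples else 0

    return callback, chunks


# Simulate two chunks of 3 samples each with a limit of 5 samples.
cb, chunks = make_collecting_callback(max_samples=5)
print(cb([0.0, 0.0, 0.0], 0.3))  # below the limit: continue
print(cb([0.0, 0.0, 0.0], 0.6))  # limit reached: stop
```

Keeping the received chunks in a list also lets the caller start playback or post-processing before generation finishes.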

C++ API

You can use the following code to play with vits-piper-ar_JO-SA_miro_V2-high with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-ar_JO-SA_miro_V2-high/ar_JO-SA_miro_V2-high.onnx";
  config.model.vits.tokens = "vits-piper-ar_JO-SA_miro_V2-high/tokens.txt";
  config.model.vits.data_dir = "vits-piper-ar_JO-SA_miro_V2-high/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "كيف حالك اليوم؟";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++   -std=c++17   -I /tmp/sherpa-onnx/shared/include   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-cxx-api   -lsherpa-onnx-c-api   -lonnxruntime   -o /tmp/test-piper   /tmp/test-piper.cc

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with vits-piper-ar_JO-SA_miro_V2-high with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-ar_JO-SA_miro_V2-high/ar_JO-SA_miro_V2-high.onnx".into()),
                tokens: Some("vits-piper-ar_JO-SA_miro_V2-high/tokens.txt".into()),
                data_dir: Some("vits-piper-ar_JO-SA_miro_V2-high/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "كيف حالك اليوم؟";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-ar_JO-SA_miro_V2-high with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-ar_JO-SA_miro_V2-high/ar_JO-SA_miro_V2-high.onnx',
        tokens: 'vits-piper-ar_JO-SA_miro_V2-high/tokens.txt',
        dataDir: 'vits-piper-ar_JO-SA_miro_V2-high/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'كيف حالك اليوم؟';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to play with vits-piper-ar_JO-SA_miro_V2-high with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-ar_JO-SA_miro_V2-high/ar_JO-SA_miro_V2-high.onnx',
    tokens: 'vits-piper-ar_JO-SA_miro_V2-high/tokens.txt',
    dataDir: 'vits-piper-ar_JO-SA_miro_V2-high/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'كيف حالك اليوم؟', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-ar_JO-SA_miro_V2-high with Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-ar_JO-SA_miro_V2-high/ar_JO-SA_miro_V2-high.onnx",
    lexicon: "",
    tokens: "vits-piper-ar_JO-SA_miro_V2-high/tokens.txt",
    dataDir: "vits-piper-ar_JO-SA_miro_V2-high/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "كيف حالك اليوم؟"
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-ar_JO-SA_miro_V2-high with the C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-ar_JO-SA_miro_V2-high/ar_JO-SA_miro_V2-high.onnx";
config.Model.Vits.Tokens = "vits-piper-ar_JO-SA_miro_V2-high/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-ar_JO-SA_miro_V2-high/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "كيف حالك اليوم؟";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-ar_JO-SA_miro_V2-high with the Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-ar_JO-SA_miro_V2-high/ar_JO-SA_miro_V2-high.onnx",
        tokens = "vits-piper-ar_JO-SA_miro_V2-high/tokens.txt",
        dataDir = "vits-piper-ar_JO-SA_miro_V2-high/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "كيف حالك اليوم؟",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with vits-piper-ar_JO-SA_miro_V2-high with the Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-ar_JO-SA_miro_V2-high/ar_JO-SA_miro_V2-high.onnx");
    vits.setTokens("vits-piper-ar_JO-SA_miro_V2-high/tokens.txt");
    vits.setDataDir("vits-piper-ar_JO-SA_miro_V2-high/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "كيف حالك اليوم؟";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-ar_JO-SA_miro_V2-high with the Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-ar_JO-SA_miro_V2-high/ar_JO-SA_miro_V2-high.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-ar_JO-SA_miro_V2-high/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-ar_JO-SA_miro_V2-high/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('كيف حالك اليوم؟', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-ar_JO-SA_miro_V2-high with the Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-ar_JO-SA_miro_V2-high/ar_JO-SA_miro_V2-high.onnx",
				Tokens:  "vits-piper-ar_JO-SA_miro_V2-high/tokens.txt",
				DataDir: "vits-piper-ar_JO-SA_miro_V2-high/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "كيف حالك اليوم؟"

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

كيف حالك اليوم؟

audio samples for each speaker are listed below:

Speaker 0

vits-piper-ar_JO-kareem-low

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/ar/ar_JO/kareem/low

Number of speakers | Sample rate
1 | 16000

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-ar_JO-kareem-low.tar.bz2
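
The archive at the address above can be fetched and unpacked with wget and tar, as shown for other models on this page. As an alternative, here is a minimal Python sketch that uses only the standard library. The helper names download and extract are illustrative and not part of sherpa-onnx, and the sketch does not check what the archive contains.

```python
import tarfile
import urllib.request
from pathlib import Path

# URL copied from the download address above.
MODEL_URL = (
    "https://github.com/k2-fsa/sherpa-onnx/releases/download/"
    "tts-models/vits-piper-ar_JO-kareem-low.tar.bz2"
)


def download(url, dest_dir="."):
    """Download url into dest_dir and return the path of the local archive."""
    archive = Path(dest_dir) / url.rsplit("/", 1)[-1]
    urllib.request.urlretrieve(url, archive)
    return archive


def extract(archive, dest_dir="."):
    """Unpack a .tar.bz2 archive and return its sorted top-level entry names."""
    with tarfile.open(archive, "r:bz2") as tar:
        names = tar.getnames()
        tar.extractall(dest_dir)
    return sorted({name.split("/", 1)[0] for name in names})


# Usage (hypothetical, needs network access):
#   archive = download(MODEL_URL)
#   print(extract(archive))
#   archive.unlink()  # remove the archive once it is unpacked
```

Only extract archives from sources you trust, since tarfile writes whatever paths the archive contains.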

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI | URL | China mirror
arm64-v8a | Download | Download
armeabi-v7a | Download | Download
x86_64 | Download | Download
x86 | Download | Download

If you don’t know what ABI is, you probably need to select arm64-v8a.
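
If you would rather check than guess, a connected device can report its supported ABIs. The sketch below is one way to do that from Python. It assumes adb is installed and on PATH with exactly one device attached, and it reads the Android system property ro.product.cpu.abilist. The function names are illustrative, and the fallback to arm64-v8a simply mirrors the advice above.

```python
import subprocess

# ABIs for which the table above lists an APK.
APK_ABIS = ("arm64-v8a", "armeabi-v7a", "x86_64", "x86")


def pick_abi(abilist):
    """Return the first ABI in a comma-separated, preference-ordered list
    that has an APK in the table; fall back to arm64-v8a otherwise."""
    for abi in abilist.split(","):
        abi = abi.strip()
        if abi in APK_ABIS:
            return abi
    return "arm64-v8a"


def device_abilist():
    """Ask the attached device for its ABI list via adb (assumption: adb is on PATH)."""
    result = subprocess.run(
        ["adb", "shell", "getprop", "ro.product.cpu.abilist"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


# Usage (needs a device): print(pick_abi(device_abilist()))
```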

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-ar_JO-kareem-low with the C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-ar_JO-kareem-low/ar_JO-kareem-low.onnx";
  config.model.vits.tokens = "vits-piper-ar_JO-kareem-low/tokens.txt";
  config.model.vits.data_dir = "vits-piper-ar_JO-kareem-low/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "كيف حالك اليوم؟";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc   -I /tmp/sherpa-onnx/shared/include   -o /tmp/test-piper   /tmp/test-piper.c   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx and soundfile (used below to write the audio file) via

pip install sherpa-onnx soundfile

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-ar_JO-kareem-low.tar.bz2

You can use the following code to play with vits-piper-ar_JO-kareem-low with the Python API.

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-ar_JO-kareem-low/ar_JO-kareem-low.onnx",
            data_dir="vits-piper-ar_JO-kareem-low/espeak-ng-data",
            tokens="vits-piper-ar_JO-kareem-low/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="كيف حالك اليوم؟",
                     sid=0,
                     speed=1.0)

sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
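
The Node.js examples on this page report a real-time factor (RTF), which is the elapsed generation time divided by the duration of the generated audio. A minimal sketch of the same measurement for Python follows. The helpers real_time_factor and timed are illustrative names, not part of sherpa-onnx, and the usage lines assume the tts and audio objects from the example above.

```python
import time


def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """RTF = processing time / audio duration; below 1.0 is faster than real time."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("need a non-empty audio buffer and a positive sample rate")
    duration = num_samples / sample_rate
    return elapsed_seconds / duration


def timed(fn, *args, **kwargs):
    """Call fn and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


# Usage with the objects from the example above (not run here):
#   audio, elapsed = timed(tts.generate, text="...", sid=0, speed=1.0)
#   print(real_time_factor(elapsed, len(audio.samples), audio.sample_rate))
```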

C++ API

You can use the following code to play with vits-piper-ar_JO-kareem-low with the C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-ar_JO-kareem-low/ar_JO-kareem-low.onnx";
  config.model.vits.tokens = "vits-piper-ar_JO-kareem-low/tokens.txt";
  config.model.vits.data_dir = "vits-piper-ar_JO-kareem-low/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "كيف حالك اليوم؟";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++   -std=c++17   -I /tmp/sherpa-onnx/shared/include   -o /tmp/test-piper   /tmp/test-piper.cc   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-cxx-api   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with vits-piper-ar_JO-kareem-low with the Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-ar_JO-kareem-low/ar_JO-kareem-low.onnx".into()),
                tokens: Some("vits-piper-ar_JO-kareem-low/tokens.txt".into()),
                data_dir: Some("vits-piper-ar_JO-kareem-low/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "كيف حالك اليوم؟";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-ar_JO-kareem-low with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-ar_JO-kareem-low/ar_JO-kareem-low.onnx',
        tokens: 'vits-piper-ar_JO-kareem-low/tokens.txt',
        dataDir: 'vits-piper-ar_JO-kareem-low/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'كيف حالك اليوم؟';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to play with vits-piper-ar_JO-kareem-low with the Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-ar_JO-kareem-low/ar_JO-kareem-low.onnx',
    tokens: 'vits-piper-ar_JO-kareem-low/tokens.txt',
    dataDir: 'vits-piper-ar_JO-kareem-low/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'كيف حالك اليوم؟', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-ar_JO-kareem-low with the Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-ar_JO-kareem-low/ar_JO-kareem-low.onnx",
    lexicon: "",
    tokens: "vits-piper-ar_JO-kareem-low/tokens.txt",
    dataDir: "vits-piper-ar_JO-kareem-low/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "كيف حالك اليوم؟"
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-ar_JO-kareem-low with the C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-ar_JO-kareem-low/ar_JO-kareem-low.onnx";
config.Model.Vits.Tokens = "vits-piper-ar_JO-kareem-low/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-ar_JO-kareem-low/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "كيف حالك اليوم؟";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-ar_JO-kareem-low with the Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-ar_JO-kareem-low/ar_JO-kareem-low.onnx",
        tokens = "vits-piper-ar_JO-kareem-low/tokens.txt",
        dataDir = "vits-piper-ar_JO-kareem-low/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "كيف حالك اليوم؟",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with vits-piper-ar_JO-kareem-low with the Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-ar_JO-kareem-low/ar_JO-kareem-low.onnx");
    vits.setTokens("vits-piper-ar_JO-kareem-low/tokens.txt");
    vits.setDataDir("vits-piper-ar_JO-kareem-low/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "كيف حالك اليوم؟";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-ar_JO-kareem-low with the Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-ar_JO-kareem-low/ar_JO-kareem-low.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-ar_JO-kareem-low/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-ar_JO-kareem-low/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('كيف حالك اليوم؟', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-ar_JO-kareem-low with the Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-ar_JO-kareem-low/ar_JO-kareem-low.onnx",
				Tokens:  "vits-piper-ar_JO-kareem-low/tokens.txt",
				DataDir: "vits-piper-ar_JO-kareem-low/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "كيف حالك اليوم؟"

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

كيف حالك اليوم؟

audio samples for each speaker are listed below:

Speaker 0

vits-piper-ar_JO-kareem-medium

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/ar/ar_JO/kareem/medium

Number of speakers | Sample rate
1 | 22050
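
Once one of the examples below has written test.wav, you can confirm that the file reports the sample rate listed above. The following is a minimal sketch using Python's standard wave module. It assumes the example wrote an uncompressed PCM WAV file, which is the only kind that module reads. The helper names wav_info and check_sample_rate are illustrative.

```python
import wave


def wav_info(path):
    """Read basic header fields of a PCM WAV file using only the standard library."""
    with wave.open(path, "rb") as wf:
        frames = wf.getnframes()
        rate = wf.getframerate()
        return {
            "channels": wf.getnchannels(),
            "sample_rate": rate,
            "duration_seconds": frames / rate,
        }


def check_sample_rate(path, expected=22050):
    """Return True if the file's sample rate equals the expected value."""
    return wav_info(path)["sample_rate"] == expected


# Usage (after running one of the examples): print(wav_info("test.wav"))
```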

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-ar_JO-kareem-medium.tar.bz2

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI | URL | China mirror
arm64-v8a | Download | Download
armeabi-v7a | Download | Download
x86_64 | Download | Download
x86 | Download | Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-ar_JO-kareem-medium with the C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-ar_JO-kareem-medium/ar_JO-kareem-medium.onnx";
  config.model.vits.tokens = "vits-piper-ar_JO-kareem-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-ar_JO-kareem-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "كيف حالك اليوم؟";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc   -I /tmp/sherpa-onnx/shared/include   -o /tmp/test-piper   /tmp/test-piper.c   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx and soundfile (used below to write the audio file) via

pip install sherpa-onnx soundfile

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-ar_JO-kareem-medium.tar.bz2

You can use the following code to play with vits-piper-ar_JO-kareem-medium with the Python API.

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-ar_JO-kareem-medium/ar_JO-kareem-medium.onnx",
            data_dir="vits-piper-ar_JO-kareem-medium/espeak-ng-data",
            tokens="vits-piper-ar_JO-kareem-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="كيف حالك اليوم؟",
                     sid=0,
                     speed=1.0)

sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)

C++ API

You can use the following code to play with vits-piper-ar_JO-kareem-medium with the C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-ar_JO-kareem-medium/ar_JO-kareem-medium.onnx";
  config.model.vits.tokens = "vits-piper-ar_JO-kareem-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-ar_JO-kareem-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "كيف حالك اليوم؟";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++   -std=c++17   -I /tmp/sherpa-onnx/shared/include   -o /tmp/test-piper   /tmp/test-piper.cc   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-cxx-api   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.
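
As an alternative to exporting the variable in your shell, the search path can be set only for the process that runs the example. The following is a minimal sketch and not part of sherpa-onnx: `env_with_library_path` and `run_example` are hypothetical helper names, and the default paths assume the build and install locations used above.

```python
import os
import platform
import subprocess


def env_with_library_path(lib_dir, system=None, base_env=None):
    """Return a copy of the environment with lib_dir prepended to the
    shared-library search path variable for the given OS."""
    system = system or platform.system()
    env = dict(os.environ if base_env is None else base_env)
    # macOS uses DYLD_LIBRARY_PATH; everything else is treated like Linux
    # and uses LD_LIBRARY_PATH.
    var = "DYLD_LIBRARY_PATH" if system == "Darwin" else "LD_LIBRARY_PATH"
    old = env.get(var, "")
    env[var] = f"{lib_dir}:{old}" if old else lib_dir
    return env


def run_example(binary="/tmp/test-piper", lib_dir="/tmp/sherpa-onnx/shared/lib"):
    """Run the compiled example from /tmp so its relative model paths resolve."""
    env = env_with_library_path(lib_dir)
    return subprocess.run([binary], cwd="/tmp", env=env, check=True)
```

Calling `run_example()` leaves the environment of your shell untouched, which may be convenient when several builds of the library are installed side by side.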

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with vits-piper-ar_JO-kareem-medium with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-ar_JO-kareem-medium/ar_JO-kareem-medium.onnx".into()),
                tokens: Some("vits-piper-ar_JO-kareem-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-ar_JO-kareem-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "كيف حالك اليوم؟";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-ar_JO-kareem-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-ar_JO-kareem-medium/ar_JO-kareem-medium.onnx',
        tokens: 'vits-piper-ar_JO-kareem-medium/tokens.txt',
        dataDir: 'vits-piper-ar_JO-kareem-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'كيف حالك اليوم؟';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.
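
The Node.js example above reports a real-time factor (RTF), which is the time spent generating divided by the duration of the generated audio. The same arithmetic can be sketched in Python as below. The helper names `real_time_factor` and `timed` are hypothetical and are not part of sherpa-onnx.

```python
import time


def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """RTF = processing time / duration of the generated audio."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate
    return elapsed_seconds / duration


def timed(fn, *args, **kwargs):
    """Call fn(*args, **kwargs) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start
```

An RTF below 1 means the audio was generated faster than it takes to play back. For example, 0.5 seconds of processing for 2 seconds of audio gives an RTF of 0.25.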

Dart API

You can use the following code to play with vits-piper-ar_JO-kareem-medium with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-ar_JO-kareem-medium/ar_JO-kareem-medium.onnx',
    tokens: 'vits-piper-ar_JO-kareem-medium/tokens.txt',
    dataDir: 'vits-piper-ar_JO-kareem-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'كيف حالك اليوم؟', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-ar_JO-kareem-medium with Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-ar_JO-kareem-medium/ar_JO-kareem-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-ar_JO-kareem-medium/tokens.txt",
    dataDir: "vits-piper-ar_JO-kareem-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "كيف حالك اليوم؟"
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-ar_JO-kareem-medium with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-ar_JO-kareem-medium/ar_JO-kareem-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-ar_JO-kareem-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-ar_JO-kareem-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "كيف حالك اليوم؟";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-ar_JO-kareem-medium with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-ar_JO-kareem-medium/ar_JO-kareem-medium.onnx",
        tokens = "vits-piper-ar_JO-kareem-medium/tokens.txt",
        dataDir = "vits-piper-ar_JO-kareem-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "كيف حالك اليوم؟",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.
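
The callbacks in the examples on this page return 1 to continue generating and 0 to stop. As an illustration of that convention, the sketch below stops generation once a chosen amount of audio has been received. `StopAfterSeconds` is a hypothetical class, not part of sherpa-onnx. The `(samples, progress)` arguments mirror the C `ProgressCallback` shown on this page; that each call delivers only newly generated samples, and that other language bindings pass the same arguments, are assumptions.

```python
class StopAfterSeconds:
    """Callback that asks the generator to stop once enough audio has arrived."""

    def __init__(self, sample_rate, max_seconds):
        self.max_samples = int(sample_rate * max_seconds)
        self.received = 0

    def __call__(self, samples, progress):
        # Count the samples delivered in this call.
        self.received += len(samples)
        # 1 means continue, 0 means stop (the convention used on this page).
        return 1 if self.received < self.max_samples else 0
```

Because the object keeps a running count, a fresh instance should be created for every call to the generator.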

Java API

You can use the following code to play with vits-piper-ar_JO-kareem-medium with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-ar_JO-kareem-medium/ar_JO-kareem-medium.onnx");
    vits.setTokens("vits-piper-ar_JO-kareem-medium/tokens.txt");
    vits.setDataDir("vits-piper-ar_JO-kareem-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "كيف حالك اليوم؟";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-ar_JO-kareem-medium with Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-ar_JO-kareem-medium/ar_JO-kareem-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-ar_JO-kareem-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-ar_JO-kareem-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('كيف حالك اليوم؟', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-ar_JO-kareem-medium with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-ar_JO-kareem-medium/ar_JO-kareem-medium.onnx",
				Tokens:  "vits-piper-ar_JO-kareem-medium/tokens.txt",
				DataDir: "vits-piper-ar_JO-kareem-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "كيف حالك اليوم؟"

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

كيف حالك اليوم؟

the sample audio for each speaker is listed below:

Speaker 0

supertonic-3-ar

Info about this model

This model is supertonic 3 from https://huggingface.co/Supertone/supertonic-3

It supports 31 languages: en, ko, ja, ar, bg, cs, da, de, el, es, et, fi, fr, hi, hr, hu, id, it, lt, lv, nl, pl, pt, ro, ru, sk, sl, sv, tr, uk, vi.

This page shows samples for Arabic (ar).
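
Several of the API examples on this page pass the language to the model as a small JSON string such as `{"lang": "ar"}`. The sketch below builds that string and rejects codes that are not in the list above. `SUPPORTED_LANGS` and `make_extra` are hypothetical names introduced for illustration; they are not part of sherpa-onnx.

```python
import json

# Language codes copied from the list above (31 entries).
SUPPORTED_LANGS = (
    "en", "ko", "ja", "ar", "bg", "cs", "da", "de", "el", "es", "et",
    "fi", "fr", "hi", "hr", "hu", "id", "it", "lt", "lv", "nl", "pl",
    "pt", "ro", "ru", "sk", "sl", "sv", "tr", "uk", "vi",
)


def make_extra(lang):
    """Return a JSON string such as '{"lang": "ar"}' for a supported language."""
    if lang not in SUPPORTED_LANGS:
        raise ValueError(f"unsupported language code: {lang!r}")
    return json.dumps({"lang": lang})
```

Validating the code up front gives a clear error message before any model files are loaded.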

Number of speakers | Sample rate
10 | 24000

Speaker IDs

sid | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-supertonic-3-tts-int8-2026-05-11.tar.bz2
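
The download and a quick sanity check can be scripted with the Python standard library. The following is a minimal sketch under stated assumptions: the file names in `EXPECTED_FILES` are taken from the API examples on this page, the archive is assumed to unpack into a directory named after the archive, and `download_and_extract` and `missing_model_files` are hypothetical helper names.

```python
import tarfile
import urllib.request
from pathlib import Path

URL = (
    "https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/"
    "sherpa-onnx-supertonic-3-tts-int8-2026-05-11.tar.bz2"
)

# File names used by the API examples on this page.
EXPECTED_FILES = (
    "duration_predictor.int8.onnx",
    "text_encoder.int8.onnx",
    "vector_estimator.int8.onnx",
    "vocoder.int8.onnx",
    "tts.json",
    "unicode_indexer.bin",
    "voice.bin",
)


def download_and_extract(url=URL, dest="."):
    """Download the archive into dest, extract it, and return the model directory."""
    dest_dir = Path(dest)
    archive = dest_dir / url.rsplit("/", 1)[-1]
    urllib.request.urlretrieve(url, archive)
    with tarfile.open(archive, "r:bz2") as tar:
        tar.extractall(dest_dir)
    archive.unlink()  # remove the archive once it has been extracted
    # Assumption: the archive unpacks into a directory named after itself.
    return dest_dir / archive.name[: -len(".tar.bz2")]


def missing_model_files(model_dir):
    """Return the expected file names that are absent from model_dir."""
    root = Path(model_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]
```

After `model_dir = download_and_extract()`, an empty list from `missing_model_files(model_dir)` suggests that the paths used in the examples below will resolve.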

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI | URL | China mirror
arm64-v8a | Download | Download
armeabi-v7a | Download | Download
x86_64 | Download | Download
x86 | Download | Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-supertonic-3-tts-int8-2026-05-11.tar.bz2

You can use the following code to play with supertonic-3:

import sherpa_onnx
import soundfile as sf

tts_config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        supertonic=sherpa_onnx.OfflineTtsSupertonicModelConfig(
            duration_predictor="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
            text_encoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
            vector_estimator="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
            vocoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
            tts_json="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
            unicode_indexer="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
            voice_style="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
        ),
        debug=False,
        num_threads=2,
        provider="cpu",
    ),
)

tts = sherpa_onnx.OfflineTts(tts_config)

gen_config = sherpa_onnx.GenerationConfig()
gen_config.sid = 0
gen_config.num_steps = 8
gen_config.speed = 1.0
gen_config.extra["lang"] = "ar"

audio = tts.generate("هذا هو محرك تحويل النص إلى كلام باستخدام الجيل القادم من كالدي", gen_config)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
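
After running the script above, it can be useful to confirm what was written to test.wav. The sketch below uses only the standard-library `wave` module, so it assumes the file is stored as integer PCM, which is the only encoding that module reads. `wav_info` is a hypothetical helper name.

```python
import wave


def wav_info(path):
    """Return (sample_rate, num_channels, duration_seconds) of a PCM WAV file."""
    with wave.open(path, "rb") as f:
        rate = f.getframerate()
        return rate, f.getnchannels(), f.getnframes() / rate
```

Calling `wav_info("test.wav")` should report the sample rate listed for this model above. A different value may indicate that the file was produced by another model.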

C API

You can use the following code to play with supertonic-3 with C API.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));
  config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
  config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
  config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
  config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
  config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
  config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
  config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";

  config.model.num_threads = 2;

  // If you don't want to see debug messages, please set it to 0
  config.model.debug = 1;

  const char *text = "هذا هو محرك تحويل النص إلى كلام باستخدام الجيل القادم من كالدي";

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.num_steps = 8;
  gen_cfg.speed = 1.0;
  gen_cfg.extra = "{\"lang\": \"ar\"}";

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg,
                                             ProgressCallback, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
 -DSHERPA_ONNX_ENABLE_C_API=ON \
 -DCMAKE_BUILD_TYPE=Release \
 -DBUILD_SHARED_LIBS=ON \
 -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
 ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-supertonic.c. Then you can compile it with the following command:

gcc \
  -I /tmp/sherpa-onnx/shared/include \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-c-api \
  -lonnxruntime \
  -o /tmp/test-supertonic \
  /tmp/test-supertonic.c

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-supertonic.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

C++ API

You can use the following code to play with supertonic-3 with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;

  config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
  config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
  config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
  config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
  config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
  config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
  config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";

  config.model.num_threads = 2;

  // If you don't want to see debug messages, please set it to 0
  config.model.debug = 1;

  std::string filename = "./test.wav";
  std::string text = "هذا هو محرك تحويل النص إلى كلام باستخدام الجيل القادم من كالدي";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.num_steps = 8;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed
  gen_cfg.extra = R"({"lang": "ar"})";

  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
 -DSHERPA_ONNX_ENABLE_C_API=ON \
 -DCMAKE_BUILD_TYPE=Release \
 -DBUILD_SHARED_LIBS=ON \
 -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
 ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-supertonic.cc. Then you can compile it with the following command:

g++ \
  -std=c++17 \
  -I /tmp/sherpa-onnx/shared/include \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-cxx-api \
  -lsherpa-onnx-c-api \
  -lonnxruntime \
  -o /tmp/test-supertonic \
  /tmp/test-supertonic.cc

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-supertonic.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with supertonic-3 with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsSupertonicModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            supertonic: OfflineTtsSupertonicModelConfig {
                duration_predictor: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx".into()),
                text_encoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx".into()),
                vector_estimator: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx".into()),
                vocoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx".into()),
                tts_json: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json".into()),
                unicode_indexer: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin".into()),
                voice_style: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "هذا هو محرك تحويل النص إلى كلام باستخدام الجيل القادم من كالدي";

    let gen_config = GenerationConfig {
        sid: 0,
        num_steps: 8,
        speed: 1.0,
        extra: Some(r#"{"lang": "ar"}"#.into()),
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with supertonic-3 with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      supertonic: {
        durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
        textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
        vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
        vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
        ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
        unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
        voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
      },
      debug: true,
      numThreads: 2,
      provider: 'cpu',
    },
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'هذا هو محرك تحويل النص إلى كلام باستخدام الجيل القادم من كالدي';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  numSteps: 8,
  speed: 1.0,
  extra: {lang: 'ar'},
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to play with supertonic-3 with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final supertonic = sherpa_onnx.OfflineTtsSupertonicModelConfig(
    durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
    textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
    vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
    vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
    ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
    unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
    voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    supertonic: supertonic,
    numThreads: 2,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    numSteps: 8,
    speed: 1.0,
    extra: {'lang': 'ar'},
  );
  final audio = tts.generateWithConfig(text: 'هذا هو محرك تحويل النص إلى كلام باستخدام الجيل القادم من كالدي', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with supertonic-3 with Swift API.

func run() {
  let supertonic = sherpaOnnxOfflineTtsSupertonicModelConfig(
    durationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
    textEncoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
    vectorEstimator: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
    vocoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
    ttsJson: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
    unicodeIndexer: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
    voiceStyle: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(supertonic: supertonic)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "هذا هو محرك تحويل النص إلى كلام باستخدام الجيل القادم من كالدي"
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.numSteps = 8
  genConfig.speed = 1.0
  genConfig.extra = ["lang": "ar"]

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with supertonic-3 with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Supertonic.DurationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.Model.Supertonic.TextEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.Model.Supertonic.VectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.Model.Supertonic.Vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.Model.Supertonic.TtsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.Model.Supertonic.UnicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.Model.Supertonic.VoiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";

config.Model.NumThreads = 2;
config.Model.Debug = 1;
config.Model.Provider = "cpu";

var tts = new OfflineTts(config);
var text = "هذا هو محرك تحويل النص إلى كلام باستخدام الجيل القادم من كالدي";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.NumSteps = 8;
genConfig.Speed = 1.0f;
genConfig.Extra = "{\"lang\": \"ar\"}";

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with supertonic-3 with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      supertonic = OfflineTtsSupertonicModelConfig(
        durationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
        textEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
        vectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
        vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
        ttsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
        unicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
        voiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
      ),
      numThreads = 2,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    numSteps = 8,
    speed = 1.0f,
    extra = mapOf("lang" to "ar"),
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "هذا هو محرك تحويل النص إلى كلام باستخدام الجيل القادم من كالدي",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with supertonic-3 with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var supertonic = new OfflineTtsSupertonicModelConfig();
    supertonic.setDurationPredictor("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx");
    supertonic.setTextEncoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx");
    supertonic.setVectorEstimator("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx");
    supertonic.setVocoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx");
    supertonic.setTtsJson("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json");
    supertonic.setUnicodeIndexer("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin");
    supertonic.setVoiceStyle("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setSupertonic(supertonic);
    modelConfig.setNumThreads(2);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);

    var tts = new OfflineTts(config);
    var text = "هذا هو محرك تحويل النص إلى كلام باستخدام الجيل القادم من كالدي";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setNumSteps(8);
    genConfig.setSpeed(1.0f);
    genConfig.setExtra("{\"lang\": \"ar\"}");

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.
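
The Java, Pascal, and Go examples pass the extra generation options as a JSON string such as {"lang": "ar"}. As a minimal sketch, written in Python for illustration, such a string can be built with the standard json module instead of being escaped by hand. The helper name build_extra is hypothetical and not part of sherpa-onnx; only the "lang" key appears in this document.

```python
import json


def build_extra(lang: str) -> str:
    """Return a JSON object string such as '{"lang": "ar"}'.

    build_extra is a hypothetical helper, not part of sherpa-onnx.
    """
    # json.dumps takes care of quoting and escaping, so the result is valid JSON
    return json.dumps({"lang": lang})


extra = build_extra("ar")
print(extra)
```

The resulting string could then be passed wherever the examples above use a literal JSON string for the extra field.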

Pascal API

The following code shows how to use supertonic-3 with the Pascal API.

program test_supertonic;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Supertonic.DurationPredictor := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx';
  Config.Model.Supertonic.TextEncoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx';
  Config.Model.Supertonic.VectorEstimator := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx';
  Config.Model.Supertonic.Vocoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx';
  Config.Model.Supertonic.TtsJson := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json';
  Config.Model.Supertonic.UnicodeIndexer := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin';
  Config.Model.Supertonic.VoiceStyle := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin';

  Config.Model.NumThreads := 2;
  Config.Model.Debug := True;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.NumSteps := 8;
  GenConfig.Speed := 1.0;
  GenConfig.Extra := '{"lang": "ar"}';

  Audio := Tts.GenerateWithConfig('هذا هو محرك تحويل النص إلى كلام باستخدام الجيل القادم من كالدي', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

The following code shows how to use supertonic-3 with the Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Supertonic: sherpa.OfflineTtsSupertonicModelConfig{
				DurationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
				TextEncoder:       "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
				VectorEstimator:   "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
				Vocoder:           "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
				TtsJson:           "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
				UnicodeIndexer:    "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
				VoiceStyle:        "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
			},
			NumThreads: 2,
			Debug:      true,
		},
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "هذا هو محرك تحويل النص إلى كلام باستخدام الجيل القادم من كالدي"

	genConfig := sherpa.GenerationConfig{
		Sid:       0,
		NumSteps:  8,
		Speed:     1.0,
		Extra:     `{"lang": "ar"}`,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

Sample audio clips for different speakers are listed below:

Speaker 0

0

مرحبا بالعالم.

1

كيف حالك اليوم؟

2

السماء زرقاء والهواء لطيف.

3

يساعد التعلم الآلي الحواسيب على فهم البيانات.

4

تحول تقنية تحويل النص إلى كلام الجمل إلى صوت واضح.

5

قرأ الطلاب قصة قصيرة في المكتبة صباحا.

6

أعلن القطار عن تأخير بسيط بسبب أعمال الصيانة.

7

تعمل النماذج الصغيرة بسرعة على الأجهزة المحلية.

8

يساعد المساعد الصوتي المستخدمين في المهام اليومية.

9

تحتاج الأنظمة الحديثة إلى قراءة مستقرة للنصوص الطويلة.

Speaker 1

0

مرحبا بالعالم.

1

كيف حالك اليوم؟

2

السماء زرقاء والهواء لطيف.

3

يساعد التعلم الآلي الحواسيب على فهم البيانات.

4

تحول تقنية تحويل النص إلى كلام الجمل إلى صوت واضح.

5

قرأ الطلاب قصة قصيرة في المكتبة صباحا.

6

أعلن القطار عن تأخير بسيط بسبب أعمال الصيانة.

7

تعمل النماذج الصغيرة بسرعة على الأجهزة المحلية.

8

يساعد المساعد الصوتي المستخدمين في المهام اليومية.

9

تحتاج الأنظمة الحديثة إلى قراءة مستقرة للنصوص الطويلة.

Speaker 2

0

مرحبا بالعالم.

1

كيف حالك اليوم؟

2

السماء زرقاء والهواء لطيف.

3

يساعد التعلم الآلي الحواسيب على فهم البيانات.

4

تحول تقنية تحويل النص إلى كلام الجمل إلى صوت واضح.

5

قرأ الطلاب قصة قصيرة في المكتبة صباحا.

6

أعلن القطار عن تأخير بسيط بسبب أعمال الصيانة.

7

تعمل النماذج الصغيرة بسرعة على الأجهزة المحلية.

8

يساعد المساعد الصوتي المستخدمين في المهام اليومية.

9

تحتاج الأنظمة الحديثة إلى قراءة مستقرة للنصوص الطويلة.

Speaker 3

0

مرحبا بالعالم.

1

كيف حالك اليوم؟

2

السماء زرقاء والهواء لطيف.

3

يساعد التعلم الآلي الحواسيب على فهم البيانات.

4

تحول تقنية تحويل النص إلى كلام الجمل إلى صوت واضح.

5

قرأ الطلاب قصة قصيرة في المكتبة صباحا.

6

أعلن القطار عن تأخير بسيط بسبب أعمال الصيانة.

7

تعمل النماذج الصغيرة بسرعة على الأجهزة المحلية.

8

يساعد المساعد الصوتي المستخدمين في المهام اليومية.

9

تحتاج الأنظمة الحديثة إلى قراءة مستقرة للنصوص الطويلة.

Speaker 4

0

مرحبا بالعالم.

1

كيف حالك اليوم؟

2

السماء زرقاء والهواء لطيف.

3

يساعد التعلم الآلي الحواسيب على فهم البيانات.

4

تحول تقنية تحويل النص إلى كلام الجمل إلى صوت واضح.

5

قرأ الطلاب قصة قصيرة في المكتبة صباحا.

6

أعلن القطار عن تأخير بسيط بسبب أعمال الصيانة.

7

تعمل النماذج الصغيرة بسرعة على الأجهزة المحلية.

8

يساعد المساعد الصوتي المستخدمين في المهام اليومية.

9

تحتاج الأنظمة الحديثة إلى قراءة مستقرة للنصوص الطويلة.

Speaker 5

0

مرحبا بالعالم.

1

كيف حالك اليوم؟

2

السماء زرقاء والهواء لطيف.

3

يساعد التعلم الآلي الحواسيب على فهم البيانات.

4

تحول تقنية تحويل النص إلى كلام الجمل إلى صوت واضح.

5

قرأ الطلاب قصة قصيرة في المكتبة صباحا.

6

أعلن القطار عن تأخير بسيط بسبب أعمال الصيانة.

7

تعمل النماذج الصغيرة بسرعة على الأجهزة المحلية.

8

يساعد المساعد الصوتي المستخدمين في المهام اليومية.

9

تحتاج الأنظمة الحديثة إلى قراءة مستقرة للنصوص الطويلة.

Speaker 6

0

مرحبا بالعالم.

1

كيف حالك اليوم؟

2

السماء زرقاء والهواء لطيف.

3

يساعد التعلم الآلي الحواسيب على فهم البيانات.

4

تحول تقنية تحويل النص إلى كلام الجمل إلى صوت واضح.

5

قرأ الطلاب قصة قصيرة في المكتبة صباحا.

6

أعلن القطار عن تأخير بسيط بسبب أعمال الصيانة.

7

تعمل النماذج الصغيرة بسرعة على الأجهزة المحلية.

8

يساعد المساعد الصوتي المستخدمين في المهام اليومية.

9

تحتاج الأنظمة الحديثة إلى قراءة مستقرة للنصوص الطويلة.

Speaker 7

0

مرحبا بالعالم.

1

كيف حالك اليوم؟

2

السماء زرقاء والهواء لطيف.

3

يساعد التعلم الآلي الحواسيب على فهم البيانات.

4

تحول تقنية تحويل النص إلى كلام الجمل إلى صوت واضح.

5

قرأ الطلاب قصة قصيرة في المكتبة صباحا.

6

أعلن القطار عن تأخير بسيط بسبب أعمال الصيانة.

7

تعمل النماذج الصغيرة بسرعة على الأجهزة المحلية.

8

يساعد المساعد الصوتي المستخدمين في المهام اليومية.

9

تحتاج الأنظمة الحديثة إلى قراءة مستقرة للنصوص الطويلة.

Speaker 8

0

مرحبا بالعالم.

1

كيف حالك اليوم؟

2

السماء زرقاء والهواء لطيف.

3

يساعد التعلم الآلي الحواسيب على فهم البيانات.

4

تحول تقنية تحويل النص إلى كلام الجمل إلى صوت واضح.

5

قرأ الطلاب قصة قصيرة في المكتبة صباحا.

6

أعلن القطار عن تأخير بسيط بسبب أعمال الصيانة.

7

تعمل النماذج الصغيرة بسرعة على الأجهزة المحلية.

8

يساعد المساعد الصوتي المستخدمين في المهام اليومية.

9

تحتاج الأنظمة الحديثة إلى قراءة مستقرة للنصوص الطويلة.

Speaker 9

0

مرحبا بالعالم.

1

كيف حالك اليوم؟

2

السماء زرقاء والهواء لطيف.

3

يساعد التعلم الآلي الحواسيب على فهم البيانات.

4

تحول تقنية تحويل النص إلى كلام الجمل إلى صوت واضح.

5

قرأ الطلاب قصة قصيرة في المكتبة صباحا.

6

أعلن القطار عن تأخير بسيط بسبب أعمال الصيانة.

7

تعمل النماذج الصغيرة بسرعة على الأجهزة المحلية.

8

يساعد المساعد الصوتي المستخدمين في المهام اليومية.

9

تحتاج الأنظمة الحديثة إلى قراءة مستقرة للنصوص الطويلة.

Albanian

This section lists text to speech models for Albanian.

vits-piper-sq_AL-edon-medium

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/sq/sq_AL/edon/medium

Number of speakers | Sample rate
1 | 22050

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-sq_AL-edon-medium.tar.bz2
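
As a rough sketch, the download and extraction can also be scripted with the Python standard library. The URL pattern is taken from the address above. The helper names (model_url, extract_archive, download_model) are hypothetical, and the sketch assumes the archive unpacks into a directory named after the model, which matches the paths used in the examples in this section.

```python
import tarfile
import urllib.request
from pathlib import Path

# URL prefix taken from the download address above
BASE_URL = "https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models"


def model_url(name: str) -> str:
    """Build the download URL for a model archive."""
    return f"{BASE_URL}/{name}.tar.bz2"


def extract_archive(archive: Path, dest: Path) -> None:
    """Extract a .tar.bz2 archive into dest."""
    with tarfile.open(archive, "r:bz2") as tar:
        tar.extractall(dest)


def download_model(name: str, dest: Path = Path(".")) -> Path:
    """Download and extract a model, then return the extracted directory.

    Assumes the archive contains a top-level directory called `name`.
    """
    archive = dest / f"{name}.tar.bz2"
    urllib.request.urlretrieve(model_url(name), str(archive))
    extract_archive(archive, dest)
    archive.unlink()  # remove the archive once it has been extracted
    return dest / name
```

Calling download_model("vits-piper-sq_AL-edon-medium") would fetch the archive into the current directory and extract it there.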

Android APK

The following table lists the Android TTS Engine APKs built with this model for sherpa-onnx v1.13.2:

ABI | URL | China mirror
arm64-v8a | Download | Download
armeabi-v7a | Download | Download
x86_64 | Download | Download
x86 | Download | Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

The following code shows how to use vits-piper-sq_AL-edon-medium with the C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-sq_AL-edon-medium/sq_AL-edon-medium.onnx";
  config.model.vits.tokens = "vits-piper-sq_AL-edon-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-sq_AL-edon-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Çdo fillim është i vështirë, por çdo fund është i bukur.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx and soundfile via

pip install sherpa-onnx soundfile

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-sq_AL-edon-medium.tar.bz2

The following code shows how to use vits-piper-sq_AL-edon-medium with the Python API.

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-sq_AL-edon-medium/sq_AL-edon-medium.onnx",
            data_dir="vits-piper-sq_AL-edon-medium/espeak-ng-data",
            tokens="vits-piper-sq_AL-edon-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Çdo fillim është i vështirë, por çdo fund është i bukur.",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
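
When config.validate() fails, a common cause is a missing or misplaced model file. The following sketch checks the paths used in the example before the config is built. The helper missing_files is hypothetical and only uses the standard library; the file names are the ones from the example above.

```python
from pathlib import Path


def missing_files(model_dir: str, names: list[str]) -> list[str]:
    """Return the entries of names that do not exist under model_dir."""
    root = Path(model_dir)
    return [n for n in names if not (root / n).exists()]


# File and directory names taken from the example above
required = ["sq_AL-edon-medium.onnx", "tokens.txt", "espeak-ng-data"]
missing = missing_files("vits-piper-sq_AL-edon-medium", required)
if missing:
    print("Missing:", ", ".join(missing))
```

An empty result means all three paths exist relative to the current working directory.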

C++ API

The following code shows how to use vits-piper-sq_AL-edon-medium with the C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-sq_AL-edon-medium/sq_AL-edon-medium.onnx";
  config.model.vits.tokens = "vits-piper-sq_AL-edon-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-sq_AL-edon-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Çdo fillim është i vështirë, por çdo fund është i bukur.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html
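
The progress callback in the C++ example returns 1 to continue generating and 0 to stop. The sketch below illustrates that convention in plain Python without calling sherpa-onnx at all: make_limit_callback builds a callback that asks to stop once a sample budget is reached, and drive is a hypothetical stand-in for the engine that feeds chunks until the callback returns 0. Both names are illustrative assumptions.

```python
from typing import Callable, Iterable, List


def make_limit_callback(max_samples: int) -> Callable[[List[float]], int]:
    """Return a callback that asks generation to stop after max_samples."""
    seen = 0

    def callback(samples: List[float]) -> int:
        nonlocal seen
        seen += len(samples)
        # 1 means continue, 0 means stop (same convention as the examples)
        return 1 if seen < max_samples else 0

    return callback


def drive(chunks: Iterable[List[float]], callback: Callable[[List[float]], int]) -> int:
    """Hypothetical stand-in for the engine: deliver chunks until told to stop.

    Returns the number of chunks delivered.
    """
    delivered = 0
    for chunk in chunks:
        delivered += 1
        if callback(chunk) == 0:
            break
    return delivered


chunks = [[0.0] * 100 for _ in range(5)]
print(drive(chunks, make_limit_callback(250)))
```

With five chunks of 100 samples and a budget of 250 samples, the callback returns 0 on the third chunk, so three chunks are delivered.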

Rust API

The following code shows how to use vits-piper-sq_AL-edon-medium with the Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-sq_AL-edon-medium/sq_AL-edon-medium.onnx".into()),
                tokens: Some("vits-piper-sq_AL-edon-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-sq_AL-edon-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Çdo fillim është i vështirë, por çdo fund është i bukur.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

Install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

The following code shows how to use vits-piper-sq_AL-edon-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-sq_AL-edon-medium/sq_AL-edon-medium.onnx',
        tokens: 'vits-piper-sq_AL-edon-medium/tokens.txt',
        dataDir: 'vits-piper-sq_AL-edon-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Çdo fillim është i vështirë, por çdo fund është i bukur.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.
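
The Node.js example reports a real-time factor (RTF), which is the processing time divided by the duration of the generated audio. A small Python sketch of the same arithmetic follows. The function name is hypothetical, 22050 is the sample rate listed for this model, and the elapsed time and sample count are made-up illustration values.

```python
def real_time_factor(elapsed_seconds: float, num_samples: int, sample_rate: int) -> float:
    """RTF = processing time / audio duration; below 1 is faster than real time."""
    duration = num_samples / sample_rate  # audio length in seconds
    return elapsed_seconds / duration


# 44100 samples at 22050 Hz are 2 seconds of audio; 0.5 s elapsed is made up
rtf = real_time_factor(elapsed_seconds=0.5, num_samples=44100, sample_rate=22050)
print(f"RTF = {rtf:.3f}")
```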

Dart API

The following code shows how to use vits-piper-sq_AL-edon-medium with the Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-sq_AL-edon-medium/sq_AL-edon-medium.onnx',
    tokens: 'vits-piper-sq_AL-edon-medium/tokens.txt',
    dataDir: 'vits-piper-sq_AL-edon-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Çdo fillim është i vështirë, por çdo fund është i bukur.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

The following code shows how to use vits-piper-sq_AL-edon-medium with the Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-sq_AL-edon-medium/sq_AL-edon-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-sq_AL-edon-medium/tokens.txt",
    dataDir: "vits-piper-sq_AL-edon-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Çdo fillim është i vështirë, por çdo fund është i bukur."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

The following code shows how to use vits-piper-sq_AL-edon-medium with the C# API.

using System;
using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-sq_AL-edon-medium/sq_AL-edon-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-sq_AL-edon-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-sq_AL-edon-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Çdo fillim është i vështirë, por çdo fund është i bukur.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

The following code shows how to use vits-piper-sq_AL-edon-medium with the Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  val config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-sq_AL-edon-medium/sq_AL-edon-medium.onnx",
        tokens = "vits-piper-sq_AL-edon-medium/tokens.txt",
        dataDir = "vits-piper-sq_AL-edon-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Çdo fillim është i vështirë, por çdo fund është i bukur.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

The following code shows how to use vits-piper-sq_AL-edon-medium with the Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-sq_AL-edon-medium/sq_AL-edon-medium.onnx");
    vits.setTokens("vits-piper-sq_AL-edon-medium/tokens.txt");
    vits.setDataDir("vits-piper-sq_AL-edon-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Çdo fillim është i vështirë, por çdo fund është i bukur.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

The following code shows how to use vits-piper-sq_AL-edon-medium with the Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-sq_AL-edon-medium/sq_AL-edon-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-sq_AL-edon-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-sq_AL-edon-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Çdo fillim është i vështirë, por çdo fund është i bukur.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

The following code shows how to use vits-piper-sq_AL-edon-medium with the Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-sq_AL-edon-medium/sq_AL-edon-medium.onnx",
				Tokens:  "vits-piper-sq_AL-edon-medium/tokens.txt",
				DataDir: "vits-piper-sq_AL-edon-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Çdo fillim është i vështirë, por çdo fund është i bukur."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Çdo fillim është i vështirë, por çdo fund është i bukur.

sample audio clips for different speakers are listed below:

Speaker 0

Basque

This section lists text to speech models for Basque.

vits-piper-eu_ES-antton-medium

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/eu/eu_ES/antton/medium

Number of speakers | Sample rate
1 | 22050

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-eu_ES-antton-medium.tar.bz2

Android APK

The following table lists the Android TTS Engine APKs built with this model for sherpa-onnx v1.13.2:

ABI | URL | China mirror
arm64-v8a | Download | Download
armeabi-v7a | Download | Download
x86_64 | Download | Download
x86 | Download | Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

The following code shows how to use vits-piper-eu_ES-antton-medium with the C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-eu_ES-antton-medium/eu_ES-antton-medium.onnx";
  config.model.vits.tokens = "vits-piper-eu_ES-antton-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-eu_ES-antton-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Aberats izatea baino, izen ona hobe.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx and soundfile via

pip install sherpa-onnx soundfile

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-eu_ES-antton-medium.tar.bz2

The following code shows how to use vits-piper-eu_ES-antton-medium with the Python API.

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-eu_ES-antton-medium/eu_ES-antton-medium.onnx",
            data_dir="vits-piper-eu_ES-antton-medium/espeak-ng-data",
            tokens="vits-piper-eu_ES-antton-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Aberats izatea baino, izen ona hobe.",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
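If soundfile is not available, the samples can also be written with Python's standard library. The following is a minimal sketch, assuming the samples are mono floats in the range [-1, 1]; the helper name write_wave_pcm16 is hypothetical and not part of sherpa-onnx.

```python
import struct
import wave


def write_wave_pcm16(filename, samples, sample_rate):
    """Write mono float samples in [-1, 1] as a 16-bit PCM WAV file.

    Hypothetical helper for illustration; not part of sherpa-onnx.
    Returns the number of frames written.
    """
    pcm = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))  # clip to the valid range
        pcm += struct.pack("<h", int(s * 32767))  # little-endian int16
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)  # mono
        f.setsampwidth(2)  # 2 bytes = 16 bits per sample
        f.setframerate(sample_rate)
        f.writeframes(bytes(pcm))
    return len(pcm) // 2
```

With the objects from the example above, this could be called as write_wave_pcm16("test.wav", audio.samples, audio.sample_rate).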

C++ API

You can use the following code to try vits-piper-eu_ES-antton-medium with the C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-eu_ES-antton-medium/eu_ES-antton-medium.onnx";
  config.model.vits.tokens = "vits-piper-eu_ES-antton-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-eu_ES-antton-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Aberats izatea baino, izen ona hobe.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++   -std=c++17   -I /tmp/sherpa-onnx/shared/include   /tmp/test-piper.cc   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-cxx-api   -lsherpa-onnx-c-api   -lonnxruntime   -o /tmp/test-piper

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

If the shared libraries cannot be found at run time, you probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html
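The callback contract used in the C++ example (return 1 to continue, 0 to stop) can be illustrated without the library. The following Python sketch simulates an engine that produces audio in chunks and consults a callback after each chunk; simulate_generate and its chunking are hypothetical stand-ins, not the sherpa-onnx implementation.

```python
def simulate_generate(chunks, callback):
    """Feed chunks to callback one by one; stop early when it returns 0.

    Hypothetical stand-in for a TTS engine, for illustration only.
    """
    produced = []
    total = len(chunks)
    for i, chunk in enumerate(chunks, start=1):
        produced.extend(chunk)
        progress = i / total  # fraction of chunks generated so far
        if callback(chunk, progress) == 0:
            break  # the caller asked to stop generating
    return produced


def stop_after_half(samples, progress):
    """Example callback: continue until half of the audio is generated."""
    print(f"Progress: {progress * 100:.1f}%")
    return 1 if progress < 0.5 else 0


audio = simulate_generate([[0.1, 0.2], [0.3], [0.4], [0.5]], stop_after_half)
```

Here the callback stops generation once progress reaches 50%, so only the first two chunks end up in the result.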

Rust API

You can use the following code to try vits-piper-eu_ES-antton-medium with the Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-eu_ES-antton-medium/eu_ES-antton-medium.onnx".into()),
                tokens: Some("vits-piper-eu_ES-antton-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-eu_ES-antton-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Aberats izatea baino, izen ona hobe.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to try vits-piper-eu_ES-antton-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-eu_ES-antton-medium/eu_ES-antton-medium.onnx',
        tokens: 'vits-piper-eu_ES-antton-medium/tokens.txt',
        dataDir: 'vits-piper-eu_ES-antton-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Aberats izatea baino, izen ona hobe.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.
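The timing code in the Node.js example computes a real-time factor (RTF): the time spent generating divided by the duration of the generated audio. The same arithmetic is sketched below in Python; the function name real_time_factor is hypothetical and the numbers are made up for illustration.

```python
def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """Return elapsed time divided by audio duration.

    Hypothetical helper for illustration; not part of sherpa-onnx.
    """
    duration = num_samples / sample_rate  # audio length in seconds
    if duration <= 0:
        raise ValueError("The audio must contain at least one sample")
    return elapsed_seconds / duration


# Illustrative numbers: 44100 samples at 22050 Hz is 2 seconds of audio.
rtf = real_time_factor(elapsed_seconds=0.5, num_samples=44100, sample_rate=22050)
print(f"RTF = {rtf:.3f}")
```

An RTF below 1 means the audio is generated faster than it takes to play it back.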

Dart API

You can use the following code to try vits-piper-eu_ES-antton-medium with the Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-eu_ES-antton-medium/eu_ES-antton-medium.onnx',
    tokens: 'vits-piper-eu_ES-antton-medium/tokens.txt',
    dataDir: 'vits-piper-eu_ES-antton-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Aberats izatea baino, izen ona hobe.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to try vits-piper-eu_ES-antton-medium with the Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-eu_ES-antton-medium/eu_ES-antton-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-eu_ES-antton-medium/tokens.txt",
    dataDir: "vits-piper-eu_ES-antton-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Aberats izatea baino, izen ona hobe."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to try vits-piper-eu_ES-antton-medium with the C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-eu_ES-antton-medium/eu_ES-antton-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-eu_ES-antton-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-eu_ES-antton-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Aberats izatea baino, izen ona hobe.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to try vits-piper-eu_ES-antton-medium with the Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-eu_ES-antton-medium/eu_ES-antton-medium.onnx",
        tokens = "vits-piper-eu_ES-antton-medium/tokens.txt",
        dataDir = "vits-piper-eu_ES-antton-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Aberats izatea baino, izen ona hobe.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to try vits-piper-eu_ES-antton-medium with the Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-eu_ES-antton-medium/eu_ES-antton-medium.onnx");
    vits.setTokens("vits-piper-eu_ES-antton-medium/tokens.txt");
    vits.setDataDir("vits-piper-eu_ES-antton-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Aberats izatea baino, izen ona hobe.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to try vits-piper-eu_ES-antton-medium with the Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-eu_ES-antton-medium/eu_ES-antton-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-eu_ES-antton-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-eu_ES-antton-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Aberats izatea baino, izen ona hobe.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to try vits-piper-eu_ES-antton-medium with the Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-eu_ES-antton-medium/eu_ES-antton-medium.onnx",
				Tokens:  "vits-piper-eu_ES-antton-medium/tokens.txt",
				DataDir: "vits-piper-eu_ES-antton-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Aberats izatea baino, izen ona hobe."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Aberats izatea baino, izen ona hobe.

sample audios for different speakers are listed below:

Speaker 0

vits-piper-eu_ES-maider-medium

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/eu/eu_ES/maider/medium

Number of speakers | Sample rate (Hz)
1 | 22050

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-eu_ES-maider-medium.tar.bz2
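Downloading and unpacking the archive can also be scripted. The following is a minimal Python sketch using only the standard library; the helper names download and extract_model are hypothetical, and the archive is assumed to be a bzip2-compressed tar file, as its .tar.bz2 extension suggests.

```python
import tarfile
import urllib.request

MODEL_URL = (
    "https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/"
    "vits-piper-eu_ES-maider-medium.tar.bz2"
)


def download(url, archive_path):
    """Download url and save it to archive_path (hypothetical helper)."""
    urllib.request.urlretrieve(url, archive_path)


def extract_model(archive_path, dest_dir="."):
    """Extract a .tar.bz2 archive into dest_dir and return its member names.

    Hypothetical helper for illustration; not part of sherpa-onnx.
    """
    with tarfile.open(archive_path, "r:bz2") as tar:
        names = tar.getnames()
        tar.extractall(dest_dir)
    return names
```

For example, calling download(MODEL_URL, "model.tar.bz2") and then extract_model("model.tar.bz2") should leave the extracted model directory in the current working directory.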

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI | URL | China mirror
arm64-v8a | Download | Download
armeabi-v7a | Download | Download
x86_64 | Download | Download
x86 | Download | Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to try vits-piper-eu_ES-maider-medium with the C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-eu_ES-maider-medium/eu_ES-maider-medium.onnx";
  config.model.vits.tokens = "vits-piper-eu_ES-maider-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-eu_ES-maider-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

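  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
  if (!tts) {
    // Creation can fail, e.g., when a model path is wrong
    fprintf(stderr, "Failed to create the TTS engine. Please check your config\n");
    return -1;
  }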

  const char *text = "Aberats izatea baino, izen ona hobe.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

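  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
  if (!audio) {
    // Defensive check: do not dereference a NULL result below
    fprintf(stderr, "Failed to generate audio\n");
    SherpaOnnxDestroyOfflineTts(tts);
    return -1;
  }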

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc   -I /tmp/sherpa-onnx/shared/include   /tmp/test-piper.c   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-c-api   -lonnxruntime   -o /tmp/test-piper

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

If the shared libraries cannot be found at run time, you probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.
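The choice between the two variables can also be made programmatically, for example when launching the binary from a script. The following Python sketch builds such an environment; library_path_env is a hypothetical helper, and it assumes that sys.platform equals "darwin" on macOS and that every other platform is treated like Linux.

```python
import os
import sys


def library_path_env(lib_dir, platform=sys.platform, base_env=None):
    """Return (variable name, environment) with lib_dir prepended.

    Hypothetical helper for illustration; not part of sherpa-onnx.
    """
    env = dict(os.environ if base_env is None else base_env)
    # macOS uses DYLD_LIBRARY_PATH; Linux uses LD_LIBRARY_PATH
    var = "DYLD_LIBRARY_PATH" if platform == "darwin" else "LD_LIBRARY_PATH"
    existing = env.get(var, "")
    # Keep any existing entries after the new directory
    env[var] = lib_dir + (":" + existing if existing else "")
    return var, env


var, env = library_path_env("/tmp/sherpa-onnx/shared/lib")
print(f"{var}={env[var]}")
```

The returned environment can then be passed to a launcher such as subprocess.run(["/tmp/test-piper"], env=env, cwd="/tmp").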

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx and soundfile (which the example below uses to write the audio file) via

pip install sherpa-onnx soundfile

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-eu_ES-maider-medium.tar.bz2

You can use the following code to try vits-piper-eu_ES-maider-medium with the Python API.

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-eu_ES-maider-medium/eu_ES-maider-medium.onnx",
            data_dir="vits-piper-eu_ES-maider-medium/espeak-ng-data",
            tokens="vits-piper-eu_ES-maider-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Aberats izatea baino, izen ona hobe.",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)

C++ API

You can use the following code to try vits-piper-eu_ES-maider-medium with the C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-eu_ES-maider-medium/eu_ES-maider-medium.onnx";
  config.model.vits.tokens = "vits-piper-eu_ES-maider-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-eu_ES-maider-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Aberats izatea baino, izen ona hobe.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++   -std=c++17   -I /tmp/sherpa-onnx/shared/include   /tmp/test-piper.cc   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-cxx-api   -lsherpa-onnx-c-api   -lonnxruntime   -o /tmp/test-piper

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

If the shared libraries cannot be found at run time, you probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to try vits-piper-eu_ES-maider-medium with the Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-eu_ES-maider-medium/eu_ES-maider-medium.onnx".into()),
                tokens: Some("vits-piper-eu_ES-maider-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-eu_ES-maider-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Aberats izatea baino, izen ona hobe.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to try vits-piper-eu_ES-maider-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-eu_ES-maider-medium/eu_ES-maider-medium.onnx',
        tokens: 'vits-piper-eu_ES-maider-medium/tokens.txt',
        dataDir: 'vits-piper-eu_ES-maider-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Aberats izatea baino, izen ona hobe.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to try vits-piper-eu_ES-maider-medium with the Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-eu_ES-maider-medium/eu_ES-maider-medium.onnx',
    tokens: 'vits-piper-eu_ES-maider-medium/tokens.txt',
    dataDir: 'vits-piper-eu_ES-maider-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Aberats izatea baino, izen ona hobe.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to try vits-piper-eu_ES-maider-medium with the Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-eu_ES-maider-medium/eu_ES-maider-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-eu_ES-maider-medium/tokens.txt",
    dataDir: "vits-piper-eu_ES-maider-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Aberats izatea baino, izen ona hobe."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to try vits-piper-eu_ES-maider-medium with the C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-eu_ES-maider-medium/eu_ES-maider-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-eu_ES-maider-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-eu_ES-maider-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Aberats izatea baino, izen ona hobe.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to try vits-piper-eu_ES-maider-medium with the Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-eu_ES-maider-medium/eu_ES-maider-medium.onnx",
        tokens = "vits-piper-eu_ES-maider-medium/tokens.txt",
        dataDir = "vits-piper-eu_ES-maider-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Aberats izatea baino, izen ona hobe.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to try vits-piper-eu_ES-maider-medium with the Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-eu_ES-maider-medium/eu_ES-maider-medium.onnx");
    vits.setTokens("vits-piper-eu_ES-maider-medium/tokens.txt");
    vits.setDataDir("vits-piper-eu_ES-maider-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Aberats izatea baino, izen ona hobe.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to try vits-piper-eu_ES-maider-medium with the Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-eu_ES-maider-medium/eu_ES-maider-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-eu_ES-maider-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-eu_ES-maider-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Aberats izatea baino, izen ona hobe.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-eu_ES-maider-medium with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-eu_ES-maider-medium/eu_ES-maider-medium.onnx",
				Tokens:  "vits-piper-eu_ES-maider-medium/tokens.txt",
				DataDir: "vits-piper-eu_ES-maider-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Aberats izatea baino, izen ona hobe."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Aberats izatea baino, izen ona hobe.

sample audios for different speakers are listed below:

Speaker 0

Bulgarian

This section lists text to speech models for Bulgarian.

supertonic-3-bg

Info about this model

This model is supertonic 3 from https://huggingface.co/Supertone/supertonic-3

It supports 31 languages: en, ko, ja, ar, bg, cs, da, de, el, es, et, fi, fr, hi, hr, hu, id, it, lt, lv, nl, pl, pt, ro, ru, sk, sl, sv, tr, uk, vi.

This page shows samples for Bulgarian (bg).
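
The examples on this page select the language through the generation config's `extra` field: the Python example sets `extra["lang"]`, while the C, Java, C#, Pascal, and Go examples pass a JSON string. The following is a minimal sketch of building that JSON string in Python and checking the code against the list above. The helper name `make_extra` is hypothetical and not part of sherpa-onnx.

```python
import json

# Language codes listed on this page for supertonic-3.
SUPPORTED_LANGS = {
    "en", "ko", "ja", "ar", "bg", "cs", "da", "de", "el", "es", "et",
    "fi", "fr", "hi", "hr", "hu", "id", "it", "lt", "lv", "nl", "pl",
    "pt", "ro", "ru", "sk", "sl", "sv", "tr", "uk", "vi",
}


def make_extra(lang: str) -> str:
    """Return the JSON string for the `extra` field.

    Hypothetical helper: it only checks `lang` against the list above.
    """
    if lang not in SUPPORTED_LANGS:
        raise ValueError(f"unsupported language code: {lang!r}")
    return json.dumps({"lang": lang})


print(len(SUPPORTED_LANGS))  # 31
print(make_extra("bg"))      # {"lang": "bg"}
```

Checking the code up front gives an early, readable error for a typo such as `"bu"`. How the library itself reacts to an unknown code is not covered here.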

Number of speakers | Sample rate
10 | 24000

Speaker IDs

sid | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-supertonic-3-tts-int8-2026-05-11.tar.bz2
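
If you prefer to fetch the archive from a script, the following is a minimal Python sketch using only the standard library. It assumes the archive unpacks into a directory named after the file, which matches the model paths used in the examples on this page. The helper names are hypothetical.

```python
import os
import tarfile
import urllib.request

URL = (
    "https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/"
    "sherpa-onnx-supertonic-3-tts-int8-2026-05-11.tar.bz2"
)


def archive_dir_name(url: str) -> str:
    """Directory name implied by the archive file name (an assumption)."""
    name = url.rsplit("/", 1)[-1]
    return name[: -len(".tar.bz2")] if name.endswith(".tar.bz2") else name


def download_and_extract(url: str, dest: str = ".") -> str:
    """Download `url` into `dest`, extract it, and delete the archive."""
    archive = os.path.join(dest, url.rsplit("/", 1)[-1])
    urllib.request.urlretrieve(url, archive)
    with tarfile.open(archive, "r:bz2") as tar:
        tar.extractall(dest)
    os.remove(archive)
    return os.path.join(dest, archive_dir_name(url))


# Usage (downloads a large file, so it is left commented out):
# model_dir = download_and_extract(URL)
```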

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI | URL | China mirror
arm64-v8a | Download | Download
armeabi-v7a | Download | Download
x86_64 | Download | Download
x86 | Download | Download

If you don’t know what ABI is, you probably need to select arm64-v8a.
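
As a rough aid for picking a row in the table, the sketch below maps a machine name, such as the output of `uname -m` on the device, to one of the four ABI names. The mapping and the helper name `guess_abi` are illustrative assumptions and not part of sherpa-onnx. Unknown names fall back to arm64-v8a, as suggested above.

```python
# Map `uname -m` style machine names to the ABI names used in the table.
# This mapping is an illustrative assumption, not part of sherpa-onnx.
MACHINE_TO_ABI = {
    "aarch64": "arm64-v8a",
    "arm64": "arm64-v8a",
    "armv7l": "armeabi-v7a",
    "x86_64": "x86_64",
    "i686": "x86",
    "i386": "x86",
}


def guess_abi(machine: str, default: str = "arm64-v8a") -> str:
    """Return the ABI for `machine`, or `default` if it is not recognized."""
    return MACHINE_TO_ABI.get(machine.strip().lower(), default)


print(guess_abi("aarch64"))  # arm64-v8a
print(guess_abi("armv7l"))   # armeabi-v7a
```

On a device connected over adb, `adb shell getprop ro.product.cpu.abi` reports the ABI directly, so the mapping is only needed when you have a machine name alone.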

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-supertonic-3-tts-int8-2026-05-11.tar.bz2

You can use the following code to play with supertonic-3:

import sherpa_onnx
import soundfile as sf

tts_config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        supertonic=sherpa_onnx.OfflineTtsSupertonicModelConfig(
            duration_predictor="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
            text_encoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
            vector_estimator="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
            vocoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
            tts_json="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
            unicode_indexer="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
            voice_style="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
        ),
        debug=False,
        num_threads=2,
        provider="cpu",
    ),
)

tts = sherpa_onnx.OfflineTts(tts_config)

gen_config = sherpa_onnx.GenerationConfig()
gen_config.sid = 0
gen_config.num_steps = 8
gen_config.speed = 1.0
gen_config.extra["lang"] = "bg"

audio = tts.generate("Това е машина за преобразуване на текст в реч, използваща Kaldi от следващо поколение", gen_config)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
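
The example above saves the audio with the third-party soundfile package. If you would rather avoid that dependency, the following is a minimal sketch that writes 16-bit PCM with Python's standard `wave` module. It assumes the samples are mono floats in the range [-1, 1]. The function name `write_wave` is hypothetical.

```python
import struct
import wave


def write_wave(filename: str, samples, sample_rate: int) -> int:
    """Write mono float samples as 16-bit PCM and return the frame count."""
    pcm = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))  # clip to the valid range
        pcm += struct.pack("<h", int(s * 32767))
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)  # 2 bytes = 16 bits per sample
        f.setframerate(sample_rate)
        f.writeframes(bytes(pcm))
    return len(pcm) // 2


# Usage with the objects from the example above:
# write_wave("test.wav", audio.samples, audio.sample_rate)
```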

C API

You can use the following code to play with supertonic-3 with C API.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));
  config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
  config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
  config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
  config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
  config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
  config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
  config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";

  config.model.num_threads = 2;

  // If you don't want to see debug messages, please set it to 0
  config.model.debug = 1;

  const char *text = "Това е машина за преобразуване на текст в реч, използваща Kaldi от следващо поколение";

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.num_steps = 8;
  gen_cfg.speed = 1.0;
  gen_cfg.extra = "{\"lang\": \"bg\"}";

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg,
                                             ProgressCallback, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

To compile the example, first build sherpa-onnx as a shared library with the C API enabled:

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
 -DSHERPA_ONNX_ENABLE_C_API=ON \
 -DCMAKE_BUILD_TYPE=Release \
 -DBUILD_SHARED_LIBS=ON \
 -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
 ..

make
make install

You can find the required header and library files in /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-supertonic.c. Then you can compile it with the following command:

gcc \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-supertonic \
  /tmp/test-supertonic.c \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-supertonic.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

C++ API

You can use the following code to play with supertonic-3 with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;

  config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
  config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
  config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
  config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
  config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
  config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
  config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";

  config.model.num_threads = 2;

  // If you don't want to see debug messages, please set it to 0
  config.model.debug = 1;

  std::string filename = "./test.wav";
  std::string text = "Това е машина за преобразуване на текст в реч, използваща Kaldi от следващо поколение";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.num_steps = 8;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed
  gen_cfg.extra = R"({"lang": "bg"})";

  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

To compile the example, first build sherpa-onnx as a shared library with the C API enabled:

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
 -DSHERPA_ONNX_ENABLE_C_API=ON \
 -DCMAKE_BUILD_TYPE=Release \
 -DBUILD_SHARED_LIBS=ON \
 -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
 ..

make
make install

You can find the required header and library files in /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-supertonic.cc. Then you can compile it with the following command:

g++ \
  -std=c++17 \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-supertonic \
  /tmp/test-supertonic.cc \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-cxx-api \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-supertonic.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with supertonic-3 with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsSupertonicModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            supertonic: OfflineTtsSupertonicModelConfig {
                duration_predictor: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx".into()),
                text_encoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx".into()),
                vector_estimator: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx".into()),
                vocoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx".into()),
                tts_json: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json".into()),
                unicode_indexer: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin".into()),
                voice_style: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Това е машина за преобразуване на текст в реч, използваща Kaldi от следващо поколение";

    let gen_config = GenerationConfig {
        sid: 0,
        num_steps: 8,
        speed: 1.0,
        extra: Some(r#"{"lang": "bg"}"#.into()),
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with supertonic-3 with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      supertonic: {
        durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
        textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
        vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
        vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
        ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
        unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
        voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
      },
      debug: true,
      numThreads: 2,
      provider: 'cpu',
    },
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Това е машина за преобразуване на текст в реч, използваща Kaldi от следващо поколение';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  numSteps: 8,
  speed: 1.0,
  extra: {lang: 'bg'},
});

const start = Date.now();
const audio = tts.generate({text, generationConfig});
const stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.
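
The Node.js example above reports a real-time factor (RTF): the time spent generating divided by the duration of the generated audio, so values below 1 mean faster than real time. The following is a minimal Python sketch of the same calculation. The helper names `real_time_factor` and `timed` are hypothetical.

```python
import time


def real_time_factor(elapsed_seconds: float, num_samples: int,
                     sample_rate: int) -> float:
    """Return elapsed time divided by the audio duration in seconds."""
    duration = num_samples / sample_rate
    return elapsed_seconds / duration


def timed(fn, *args, **kwargs):
    """Run `fn` and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


# 0.5 s spent generating 2 s of audio (48000 samples at 24000 Hz).
print(real_time_factor(0.5, 48000, 24000))  # 0.25

# Usage with the objects from the Python API example on this page:
# audio, elapsed = timed(tts.generate, text, gen_config)
# rtf = real_time_factor(elapsed, len(audio.samples), audio.sample_rate)
```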

Dart API

You can use the following code to play with supertonic-3 with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final supertonic = sherpa_onnx.OfflineTtsSupertonicModelConfig(
    durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
    textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
    vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
    vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
    ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
    unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
    voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    supertonic: supertonic,
    numThreads: 2,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    numSteps: 8,
    speed: 1.0,
    extra: {'lang': 'bg'},
  );
  final audio = tts.generateWithConfig(text: 'Това е машина за преобразуване на текст в реч, използваща Kaldi от следващо поколение', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with supertonic-3 with Swift API.

func run() {
  let supertonic = sherpaOnnxOfflineTtsSupertonicModelConfig(
    durationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
    textEncoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
    vectorEstimator: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
    vocoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
    ttsJson: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
    unicodeIndexer: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
    voiceStyle: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(supertonic: supertonic)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Това е машина за преобразуване на текст в реч, използваща Kaldi от следващо поколение"
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.numSteps = 8
  genConfig.speed = 1.0
  genConfig.extra = ["lang": "bg"]

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with supertonic-3 with C# API.

using System;
using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Supertonic.DurationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.Model.Supertonic.TextEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.Model.Supertonic.VectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.Model.Supertonic.Vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.Model.Supertonic.TtsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.Model.Supertonic.UnicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.Model.Supertonic.VoiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";

config.Model.NumThreads = 2;
config.Model.Debug = 1;
config.Model.Provider = "cpu";

var tts = new OfflineTts(config);
var text = "Това е машина за преобразуване на текст в реч, използваща Kaldi от следващо поколение";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.NumSteps = 8;
genConfig.Speed = 1.0f;
genConfig.Extra = "{\"lang\": \"bg\"}";

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with supertonic-3 with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  val config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      supertonic = OfflineTtsSupertonicModelConfig(
        durationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
        textEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
        vectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
        vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
        ttsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
        unicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
        voiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
      ),
      numThreads = 2,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    numSteps = 8,
    speed = 1.0f,
    extra = mapOf("lang" to "bg"),
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Това е машина за преобразуване на текст в реч, използваща Kaldi от следващо поколение",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with supertonic-3 with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var supertonic = new OfflineTtsSupertonicModelConfig();
    supertonic.setDurationPredictor("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx");
    supertonic.setTextEncoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx");
    supertonic.setVectorEstimator("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx");
    supertonic.setVocoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx");
    supertonic.setTtsJson("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json");
    supertonic.setUnicodeIndexer("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin");
    supertonic.setVoiceStyle("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setSupertonic(supertonic);
    modelConfig.setNumThreads(2);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);

    var tts = new OfflineTts(config);
    var text = "Това е машина за преобразуване на текст в реч, използваща Kaldi от следващо поколение";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setNumSteps(8);
    genConfig.setSpeed(1.0f);
    genConfig.setExtra("{\"lang\": \"bg\"}");

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with supertonic-3 with Pascal API.

program test_supertonic;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Supertonic.DurationPredictor := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx';
  Config.Model.Supertonic.TextEncoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx';
  Config.Model.Supertonic.VectorEstimator := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx';
  Config.Model.Supertonic.Vocoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx';
  Config.Model.Supertonic.TtsJson := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json';
  Config.Model.Supertonic.UnicodeIndexer := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin';
  Config.Model.Supertonic.VoiceStyle := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin';

  Config.Model.NumThreads := 2;
  Config.Model.Debug := True;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.NumSteps := 8;
  GenConfig.Speed := 1.0;
  GenConfig.Extra := '{"lang": "bg"}';

  Audio := Tts.GenerateWithConfig('Това е машина за преобразуване на текст в реч, използваща Kaldi от следващо поколение', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with supertonic-3 with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Supertonic: sherpa.OfflineTtsSupertonicModelConfig{
				DurationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
				TextEncoder:       "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
				VectorEstimator:   "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
				Vocoder:           "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
				TtsJson:           "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
				UnicodeIndexer:    "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
				VoiceStyle:        "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
			},
			NumThreads: 2,
			Debug:      true,
		},
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Това е машина за преобразуване на текст в реч, използваща Kaldi от следващо поколение"

	genConfig := sherpa.GenerationConfig{
		Sid:      0,
		NumSteps: 8,
		Speed:    1.0,
		Extra:    `{"lang": "bg"}`,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

Sample audios for different speakers are listed below:

Speaker 0

0

Здравей свят.

1

Как си днес?

2

Небето е синьо, а вятърът е тих.

3

Машинното обучение помага на компютрите да учат от данни.

4

Синтезът на реч превръща текст в ясен звук.

5

Учениците прочетоха кратка история в библиотеката.

6

Влакът закъсня заради поддръжка на релсите.

7

Малките модели работят бързо на локални устройства.

8

Гласовите асистенти улесняват ежедневните задачи.

9

Стабилното четене е важно за дълги и кратки изречения.

Speaker 1

0

Здравей свят.

1

Как си днес?

2

Небето е синьо, а вятърът е тих.

3

Машинното обучение помага на компютрите да учат от данни.

4

Синтезът на реч превръща текст в ясен звук.

5

Учениците прочетоха кратка история в библиотеката.

6

Влакът закъсня заради поддръжка на релсите.

7

Малките модели работят бързо на локални устройства.

8

Гласовите асистенти улесняват ежедневните задачи.

9

Стабилното четене е важно за дълги и кратки изречения.

Speaker 2

0

Здравей свят.

1

Как си днес?

2

Небето е синьо, а вятърът е тих.

3

Машинното обучение помага на компютрите да учат от данни.

4

Синтезът на реч превръща текст в ясен звук.

5

Учениците прочетоха кратка история в библиотеката.

6

Влакът закъсня заради поддръжка на релсите.

7

Малките модели работят бързо на локални устройства.

8

Гласовите асистенти улесняват ежедневните задачи.

9

Стабилното четене е важно за дълги и кратки изречения.

Speaker 3

0

Здравей свят.

1

Как си днес?

2

Небето е синьо, а вятърът е тих.

3

Машинното обучение помага на компютрите да учат от данни.

4

Синтезът на реч превръща текст в ясен звук.

5

Учениците прочетоха кратка история в библиотеката.

6

Влакът закъсня заради поддръжка на релсите.

7

Малките модели работят бързо на локални устройства.

8

Гласовите асистенти улесняват ежедневните задачи.

9

Стабилното четене е важно за дълги и кратки изречения.

Speaker 4

0

Здравей свят.

1

Как си днес?

2

Небето е синьо, а вятърът е тих.

3

Машинното обучение помага на компютрите да учат от данни.

4

Синтезът на реч превръща текст в ясен звук.

5

Учениците прочетоха кратка история в библиотеката.

6

Влакът закъсня заради поддръжка на релсите.

7

Малките модели работят бързо на локални устройства.

8

Гласовите асистенти улесняват ежедневните задачи.

9

Стабилното четене е важно за дълги и кратки изречения.

Speaker 5

0

Здравей свят.

1

Как си днес?

2

Небето е синьо, а вятърът е тих.

3

Машинното обучение помага на компютрите да учат от данни.

4

Синтезът на реч превръща текст в ясен звук.

5

Учениците прочетоха кратка история в библиотеката.

6

Влакът закъсня заради поддръжка на релсите.

7

Малките модели работят бързо на локални устройства.

8

Гласовите асистенти улесняват ежедневните задачи.

9

Стабилното четене е важно за дълги и кратки изречения.

Speaker 6

0

Здравей свят.

1

Как си днес?

2

Небето е синьо, а вятърът е тих.

3

Машинното обучение помага на компютрите да учат от данни.

4

Синтезът на реч превръща текст в ясен звук.

5

Учениците прочетоха кратка история в библиотеката.

6

Влакът закъсня заради поддръжка на релсите.

7

Малките модели работят бързо на локални устройства.

8

Гласовите асистенти улесняват ежедневните задачи.

9

Стабилното четене е важно за дълги и кратки изречения.

Speaker 7

0

Здравей свят.

1

Как си днес?

2

Небето е синьо, а вятърът е тих.

3

Машинното обучение помага на компютрите да учат от данни.

4

Синтезът на реч превръща текст в ясен звук.

5

Учениците прочетоха кратка история в библиотеката.

6

Влакът закъсня заради поддръжка на релсите.

7

Малките модели работят бързо на локални устройства.

8

Гласовите асистенти улесняват ежедневните задачи.

9

Стабилното четене е важно за дълги и кратки изречения.

Speaker 8

0

Здравей свят.

1

Как си днес?

2

Небето е синьо, а вятърът е тих.

3

Машинното обучение помага на компютрите да учат от данни.

4

Синтезът на реч превръща текст в ясен звук.

5

Учениците прочетоха кратка история в библиотеката.

6

Влакът закъсня заради поддръжка на релсите.

7

Малките модели работят бързо на локални устройства.

8

Гласовите асистенти улесняват ежедневните задачи.

9

Стабилното четене е важно за дълги и кратки изречения.

Speaker 9

0

Здравей свят.

1

Как си днес?

2

Небето е синьо, а вятърът е тих.

3

Машинното обучение помага на компютрите да учат от данни.

4

Синтезът на реч превръща текст в ясен звук.

5

Учениците прочетоха кратка история в библиотеката.

6

Влакът закъсня заради поддръжка на релсите.

7

Малките модели работят бързо на локални устройства.

8

Гласовите асистенти улесняват ежедневните задачи.

9

Стабилното четене е важно за дълги и кратки изречения.
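
The grid above pairs each of the 10 speakers with the same 10 sentences. The following is a minimal Python sketch of how such a grid could be produced. The `generate` argument is any callable taking `(sid, text)`, and the function name `synthesize_grid` is hypothetical. How the published samples were actually generated is not stated on this page.

```python
SENTENCES = [
    "Здравей свят.",
    "Как си днес?",
    "Небето е синьо, а вятърът е тих.",
    "Машинното обучение помага на компютрите да учат от данни.",
    "Синтезът на реч превръща текст в ясен звук.",
    "Учениците прочетоха кратка история в библиотеката.",
    "Влакът закъсня заради поддръжка на релсите.",
    "Малките модели работят бързо на локални устройства.",
    "Гласовите асистенти улесняват ежедневните задачи.",
    "Стабилното четене е важно за дълги и кратки изречения.",
]


def synthesize_grid(generate, sentences, num_speakers):
    """Call generate(sid, text) for every speaker/sentence pair.

    Returns a dict mapping (sid, sentence_index) to the result.
    """
    results = {}
    for sid in range(num_speakers):
        for index, text in enumerate(sentences):
            results[(sid, index)] = generate(sid, text)
    return results


# Usage with the objects from the Python API example on this page:
# def generate(sid, text):
#     gen_config.sid = sid
#     return tts.generate(text, gen_config)
#
# grid = synthesize_grid(generate, SENTENCES, num_speakers=10)
```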

Catalan

This section lists text to speech models for Catalan.

vits-piper-ca_ES-upc_ona-medium

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/ca/ca_ES/upc_ona/medium

Number of speakers | Sample rate
1 | 22050

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-ca_ES-upc_ona-medium.tar.bz2
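
The archive at the address above can also be fetched and unpacked from a script. The following is a minimal sketch using only the Python standard library; the helper names `download` and `extract` are hypothetical and not part of sherpa-onnx.

```python
import tarfile
import urllib.request
from pathlib import Path

# URL taken from the "Model download address" above.
MODEL_URL = (
    "https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/"
    "vits-piper-ca_ES-upc_ona-medium.tar.bz2"
)


def download(url, dest):
    """Download url to dest unless dest already exists (hypothetical helper)."""
    dest = Path(dest)
    if not dest.exists():
        urllib.request.urlretrieve(url, dest)
    return dest


def extract(archive, out_dir):
    """Extract a .tar.bz2 archive into out_dir and return its member names."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive, "r:bz2") as tar:
        names = tar.getnames()
        tar.extractall(out_dir)
    return names
```

Calling `extract(download(MODEL_URL, "vits-piper-ca_ES-upc_ona-medium.tar.bz2"), ".")` should leave a `vits-piper-ca_ES-upc_ona-medium` directory in the current working directory, which is the layout the API examples in this section assume.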

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI | URL | China mirror
arm64-v8a | Download | Download
armeabi-v7a | Download | Download
x86_64 | Download | Download
x86 | Download | Download

If you don’t know what ABI is, you probably need to select arm64-v8a.
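
As a rough aid for picking a row in the table above, the sketch below maps common CPU architecture names to Android ABI names. The mapping table and the function `guess_abi` are illustrative assumptions, not part of sherpa-onnx; unknown names fall back to arm64-v8a, following the advice above.

```python
import platform

# Assumed mapping from `uname -m` style machine names to Android ABI names.
MACHINE_TO_ABI = {
    "aarch64": "arm64-v8a",
    "arm64": "arm64-v8a",
    "armv7l": "armeabi-v7a",
    "armv7": "armeabi-v7a",
    "x86_64": "x86_64",
    "amd64": "x86_64",
    "i686": "x86",
    "i386": "x86",
}


def guess_abi(machine=None, default="arm64-v8a"):
    """Return the ABI for a machine name, or default if the name is unknown."""
    name = (machine or platform.machine()).lower()
    return MACHINE_TO_ABI.get(name, default)
```

For a device connected over adb, `adb shell getprop ro.product.cpu.abi` reports the ABI directly, which is more reliable than guessing from an architecture string.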

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-ca_ES-upc_ona-medium with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-ca_ES-upc_ona-medium/ca_ES-upc_ona-medium.onnx";
  config.model.vits.tokens = "vits-piper-ca_ES-upc_ona-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-ca_ES-upc_ona-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Si vols estar ben servit, fes-te tu mateix el llit";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc   -I /tmp/sherpa-onnx/shared/include   -L /tmp/sherpa-onnx/shared/lib   -o /tmp/test-piper   /tmp/test-piper.c   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-ca_ES-upc_ona-medium.tar.bz2

You can use the following code to play with vits-piper-ca_ES-upc_ona-medium

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-ca_ES-upc_ona-medium/ca_ES-upc_ona-medium.onnx",
            data_dir="vits-piper-ca_ES-upc_ona-medium/espeak-ng-data",
            tokens="vits-piper-ca_ES-upc_ona-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Si vols estar ben servit, fes-te tu mateix el llit",
                     sid=0,
                     speed=1.0)

sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
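
To judge synthesis speed, the Node.js example later in this section reports a real-time factor (RTF), computed as elapsed time divided by audio duration. A minimal Python sketch of the same arithmetic follows; the function names `wave_duration` and `real_time_factor` are hypothetical helpers, not part of the sherpa_onnx package.

```python
def wave_duration(num_samples, sample_rate):
    """Return the audio duration in seconds."""
    if sample_rate <= 0:
        raise ValueError("sample_rate must be positive")
    return num_samples / sample_rate


def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """Return elapsed time divided by audio duration (lower is faster)."""
    duration = wave_duration(num_samples, sample_rate)
    if duration == 0:
        raise ValueError("no audio samples were generated")
    return elapsed_seconds / duration
```

With the example above, one could record `time.perf_counter()` before and after `tts.generate(...)` and pass the difference together with `len(audio.samples)` and `audio.sample_rate`. An RTF below 1 means the audio is generated faster than it plays back.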

C++ API

You can use the following code to play with vits-piper-ca_ES-upc_ona-medium with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-ca_ES-upc_ona-medium/ca_ES-upc_ona-medium.onnx";
  config.model.vits.tokens = "vits-piper-ca_ES-upc_ona-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-ca_ES-upc_ona-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Si vols estar ben servit, fes-te tu mateix el llit";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++   -std=c++17   -I /tmp/sherpa-onnx/shared/include   -L /tmp/sherpa-onnx/shared/lib   -o /tmp/test-piper   /tmp/test-piper.cc   -lsherpa-onnx-cxx-api   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with vits-piper-ca_ES-upc_ona-medium with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-ca_ES-upc_ona-medium/ca_ES-upc_ona-medium.onnx".into()),
                tokens: Some("vits-piper-ca_ES-upc_ona-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-ca_ES-upc_ona-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Si vols estar ben servit, fes-te tu mateix el llit";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-ca_ES-upc_ona-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-ca_ES-upc_ona-medium/ca_ES-upc_ona-medium.onnx',
        tokens: 'vits-piper-ca_ES-upc_ona-medium/tokens.txt',
        dataDir: 'vits-piper-ca_ES-upc_ona-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Si vols estar ben servit, fes-te tu mateix el llit';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to play with vits-piper-ca_ES-upc_ona-medium with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-ca_ES-upc_ona-medium/ca_ES-upc_ona-medium.onnx',
    tokens: 'vits-piper-ca_ES-upc_ona-medium/tokens.txt',
    dataDir: 'vits-piper-ca_ES-upc_ona-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Si vols estar ben servit, fes-te tu mateix el llit', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-ca_ES-upc_ona-medium with Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-ca_ES-upc_ona-medium/ca_ES-upc_ona-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-ca_ES-upc_ona-medium/tokens.txt",
    dataDir: "vits-piper-ca_ES-upc_ona-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Si vols estar ben servit, fes-te tu mateix el llit"
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-ca_ES-upc_ona-medium with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-ca_ES-upc_ona-medium/ca_ES-upc_ona-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-ca_ES-upc_ona-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-ca_ES-upc_ona-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Si vols estar ben servit, fes-te tu mateix el llit";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-ca_ES-upc_ona-medium with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-ca_ES-upc_ona-medium/ca_ES-upc_ona-medium.onnx",
        tokens = "vits-piper-ca_ES-upc_ona-medium/tokens.txt",
        dataDir = "vits-piper-ca_ES-upc_ona-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Si vols estar ben servit, fes-te tu mateix el llit",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with vits-piper-ca_ES-upc_ona-medium with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-ca_ES-upc_ona-medium/ca_ES-upc_ona-medium.onnx");
    vits.setTokens("vits-piper-ca_ES-upc_ona-medium/tokens.txt");
    vits.setDataDir("vits-piper-ca_ES-upc_ona-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Si vols estar ben servit, fes-te tu mateix el llit";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-ca_ES-upc_ona-medium with Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-ca_ES-upc_ona-medium/ca_ES-upc_ona-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-ca_ES-upc_ona-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-ca_ES-upc_ona-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Si vols estar ben servit, fes-te tu mateix el llit', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-ca_ES-upc_ona-medium with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-ca_ES-upc_ona-medium/ca_ES-upc_ona-medium.onnx",
				Tokens:  "vits-piper-ca_ES-upc_ona-medium/tokens.txt",
				DataDir: "vits-piper-ca_ES-upc_ona-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Si vols estar ben servit, fes-te tu mateix el llit"

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Si vols estar ben servit, fes-te tu mateix el llit

sample audios for different speakers are listed below:

Speaker 0

vits-piper-ca_ES-upc_ona-x_low

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/ca/ca_ES/upc_ona/x_low

Number of speakers | Sample rate (Hz)
1 | 16000

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-ca_ES-upc_ona-x_low.tar.bz2

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI | URL | China mirror
arm64-v8a | Download | Download
armeabi-v7a | Download | Download
x86_64 | Download | Download
x86 | Download | Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-ca_ES-upc_ona-x_low with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-ca_ES-upc_ona-x_low/ca_ES-upc_ona-x_low.onnx";
  config.model.vits.tokens = "vits-piper-ca_ES-upc_ona-x_low/tokens.txt";
  config.model.vits.data_dir = "vits-piper-ca_ES-upc_ona-x_low/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Si vols estar ben servit, fes-te tu mateix el llit";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc   -I /tmp/sherpa-onnx/shared/include   -L /tmp/sherpa-onnx/shared/lib   -o /tmp/test-piper   /tmp/test-piper.c   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-ca_ES-upc_ona-x_low.tar.bz2

You can use the following code to play with vits-piper-ca_ES-upc_ona-x_low

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-ca_ES-upc_ona-x_low/ca_ES-upc_ona-x_low.onnx",
            data_dir="vits-piper-ca_ES-upc_ona-x_low/espeak-ng-data",
            tokens="vits-piper-ca_ES-upc_ona-x_low/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Si vols estar ben servit, fes-te tu mateix el llit",
                     sid=0,
                     speed=1.0)

sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
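
The example above assumes the archive has been extracted into the current working directory. A small pre-flight check can make a wrong path easier to diagnose before the config is built. The sketch below is one way to do it; `missing_model_files` is a hypothetical helper, and the three expected entries are the ones referenced by the config above.

```python
from pathlib import Path


def missing_model_files(model_dir, model_name):
    """Return the expected model paths under model_dir that do not exist."""
    root = Path(model_dir)
    expected = [
        root / f"{model_name}.onnx",  # the VITS model file
        root / "tokens.txt",          # the token table
        root / "espeak-ng-data",      # the espeak-ng data directory
    ]
    return [str(path) for path in expected if not path.exists()]
```

For this model, `missing_model_files("vits-piper-ca_ES-upc_ona-x_low", "ca_ES-upc_ona-x_low")` returns an empty list when all three entries are in place, and otherwise lists the paths that could not be found.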

C++ API

You can use the following code to play with vits-piper-ca_ES-upc_ona-x_low with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-ca_ES-upc_ona-x_low/ca_ES-upc_ona-x_low.onnx";
  config.model.vits.tokens = "vits-piper-ca_ES-upc_ona-x_low/tokens.txt";
  config.model.vits.data_dir = "vits-piper-ca_ES-upc_ona-x_low/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Si vols estar ben servit, fes-te tu mateix el llit";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++   -std=c++17   -I /tmp/sherpa-onnx/shared/include   -L /tmp/sherpa-onnx/shared/lib   -o /tmp/test-piper   /tmp/test-piper.cc   -lsherpa-onnx-cxx-api   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with vits-piper-ca_ES-upc_ona-x_low with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-ca_ES-upc_ona-x_low/ca_ES-upc_ona-x_low.onnx".into()),
                tokens: Some("vits-piper-ca_ES-upc_ona-x_low/tokens.txt".into()),
                data_dir: Some("vits-piper-ca_ES-upc_ona-x_low/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Si vols estar ben servit, fes-te tu mateix el llit";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-ca_ES-upc_ona-x_low with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-ca_ES-upc_ona-x_low/ca_ES-upc_ona-x_low.onnx',
        tokens: 'vits-piper-ca_ES-upc_ona-x_low/tokens.txt',
        dataDir: 'vits-piper-ca_ES-upc_ona-x_low/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Si vols estar ben servit, fes-te tu mateix el llit';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to play with vits-piper-ca_ES-upc_ona-x_low with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-ca_ES-upc_ona-x_low/ca_ES-upc_ona-x_low.onnx',
    tokens: 'vits-piper-ca_ES-upc_ona-x_low/tokens.txt',
    dataDir: 'vits-piper-ca_ES-upc_ona-x_low/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Si vols estar ben servit, fes-te tu mateix el llit', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-ca_ES-upc_ona-x_low with Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-ca_ES-upc_ona-x_low/ca_ES-upc_ona-x_low.onnx",
    lexicon: "",
    tokens: "vits-piper-ca_ES-upc_ona-x_low/tokens.txt",
    dataDir: "vits-piper-ca_ES-upc_ona-x_low/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Si vols estar ben servit, fes-te tu mateix el llit"
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-ca_ES-upc_ona-x_low with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-ca_ES-upc_ona-x_low/ca_ES-upc_ona-x_low.onnx";
config.Model.Vits.Tokens = "vits-piper-ca_ES-upc_ona-x_low/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-ca_ES-upc_ona-x_low/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Si vols estar ben servit, fes-te tu mateix el llit";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to try vits-piper-ca_ES-upc_ona-x_low with the Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-ca_ES-upc_ona-x_low/ca_ES-upc_ona-x_low.onnx",
        tokens = "vits-piper-ca_ES-upc_ona-x_low/tokens.txt",
        dataDir = "vits-piper-ca_ES-upc_ona-x_low/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Si vols estar ben servit, fes-te tu mateix el llit",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to try vits-piper-ca_ES-upc_ona-x_low with the Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-ca_ES-upc_ona-x_low/ca_ES-upc_ona-x_low.onnx");
    vits.setTokens("vits-piper-ca_ES-upc_ona-x_low/tokens.txt");
    vits.setDataDir("vits-piper-ca_ES-upc_ona-x_low/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Si vols estar ben servit, fes-te tu mateix el llit";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to try vits-piper-ca_ES-upc_ona-x_low with the Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-ca_ES-upc_ona-x_low/ca_ES-upc_ona-x_low.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-ca_ES-upc_ona-x_low/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-ca_ES-upc_ona-x_low/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Si vols estar ben servit, fes-te tu mateix el llit', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to try vits-piper-ca_ES-upc_ona-x_low with the Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-ca_ES-upc_ona-x_low/ca_ES-upc_ona-x_low.onnx",
				Tokens:  "vits-piper-ca_ES-upc_ona-x_low/tokens.txt",
				DataDir: "vits-piper-ca_ES-upc_ona-x_low/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Si vols estar ben servit, fes-te tu mateix el llit"

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Si vols estar ben servit, fes-te tu mateix el llit

sample audio for each speaker is listed below:

Speaker 0

vits-piper-ca_ES-upc_pau-x_low

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/ca/ca_ES/upc_pau/x_low

| Number of speakers | Sample rate |
|---|---|
| 1 | 16000 |

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-ca_ES-upc_pau-x_low.tar.bz2
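The archive can be fetched and unpacked with any tool. As a minimal sketch that uses only the Python standard library, the following downloads the tarball and extracts it into the current directory. The helper names (`archive_name`, `extract_archive`, `download_and_extract`) are illustrative and not part of sherpa-onnx, and the sketch assumes the archive unpacks into a directory named after itself, as the paths in the examples below suggest.

```python
import tarfile
import urllib.request
from pathlib import Path

MODEL_URL = (
    "https://github.com/k2-fsa/sherpa-onnx/releases/download/"
    "tts-models/vits-piper-ca_ES-upc_pau-x_low.tar.bz2"
)


def archive_name(url: str) -> str:
    """Return the file name component of a download URL."""
    return url.rsplit("/", 1)[-1]


def extract_archive(archive: Path, dest: Path) -> None:
    """Unpack a .tar.bz2 archive into dest."""
    with tarfile.open(str(archive), "r:bz2") as tar:
        tar.extractall(str(dest))


def download_and_extract(url: str = MODEL_URL, dest: Path = Path(".")) -> Path:
    """Download the archive (network access required) and unpack it."""
    archive = dest / archive_name(url)
    urllib.request.urlretrieve(url, str(archive))
    extract_archive(archive, dest)
    archive.unlink()  # the tarball is no longer needed after extraction
    # Assumption: the archive unpacks into a directory named after itself.
    return dest / archive.name[: -len(".tar.bz2")]
```

Calling `download_and_extract()` with no arguments should leave a vits-piper-ca_ES-upc_pau-x_low directory next to your script, provided the assumption about the archive layout holds.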

Android APK

The following table lists the Android TTS Engine APKs that bundle this model for sherpa-onnx v1.13.2.

| ABI | URL | China mirror |
|---|---|---|
| arm64-v8a | Download | Download |
| armeabi-v7a | Download | Download |
| x86_64 | Download | Download |
| x86 | Download | Download |

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to try vits-piper-ca_ES-upc_pau-x_low with the C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-ca_ES-upc_pau-x_low/ca_ES-upc_pau-x_low.onnx";
  config.model.vits.tokens = "vits-piper-ca_ES-upc_pau-x_low/tokens.txt";
  config.model.vits.data_dir = "vits-piper-ca_ES-upc_pau-x_low/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Si vols estar ben servit, fes-te tu mateix el llit";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-ca_ES-upc_pau-x_low.tar.bz2

You can then use the following code to try vits-piper-ca_ES-upc_pau-x_low with the Python API:

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-ca_ES-upc_pau-x_low/ca_ES-upc_pau-x_low.onnx",
            data_dir="vits-piper-ca_ES-upc_pau-x_low/espeak-ng-data",
            tokens="vits-piper-ca_ES-upc_pau-x_low/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Si vols estar ben servit, fes-te tu mateix el llit",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
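If soundfile is not available, the samples can also be written with the Python standard library. The sketch below assumes that `audio.samples` is a sequence of floats in the range [-1, 1] and that `audio.sample_rate` is an integer. `write_wav_16bit` is a hypothetical helper, not part of sherpa-onnx. It clips each sample and stores it as 16-bit mono PCM.

```python
import struct
import wave
from typing import Sequence


def write_wav_16bit(filename: str, samples: Sequence[float], sample_rate: int) -> None:
    """Write mono float samples in [-1, 1] as a 16-bit PCM WAV file."""
    pcm = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))  # clip to the valid range
        pcm += struct.pack("<h", int(s * 32767))  # little-endian signed 16-bit
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)  # mono
        f.setsampwidth(2)  # 2 bytes = 16 bits per sample
        f.setframerate(sample_rate)
        f.writeframes(bytes(pcm))
```

With the objects from the example above, the call would be `write_wav_16bit("test.wav", audio.samples, audio.sample_rate)`.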

C++ API

You can use the following code to try vits-piper-ca_ES-upc_pau-x_low with the C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-ca_ES-upc_pau-x_low/ca_ES-upc_pau-x_low.onnx";
  config.model.vits.tokens = "vits-piper-ca_ES-upc_pau-x_low/tokens.txt";
  config.model.vits.data_dir = "vits-piper-ca_ES-upc_pau-x_low/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Si vols estar ben servit, fes-te tu mateix el llit";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to try vits-piper-ca_ES-upc_pau-x_low with the Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-ca_ES-upc_pau-x_low/ca_ES-upc_pau-x_low.onnx".into()),
                tokens: Some("vits-piper-ca_ES-upc_pau-x_low/tokens.txt".into()),
                data_dir: Some("vits-piper-ca_ES-upc_pau-x_low/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Si vols estar ben servit, fes-te tu mateix el llit";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-ca_ES-upc_pau-x_low with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-ca_ES-upc_pau-x_low/ca_ES-upc_pau-x_low.onnx',
        tokens: 'vits-piper-ca_ES-upc_pau-x_low/tokens.txt',
        dataDir: 'vits-piper-ca_ES-upc_pau-x_low/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Si vols estar ben servit, fes-te tu mateix el llit';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.
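The real-time factor (RTF) printed by the example above is the processing time divided by the duration of the generated audio, so a value below 1 means synthesis ran faster than playback. The same arithmetic as a minimal Python sketch (`real_time_factor` is an illustrative helper, not part of sherpa-onnx):

```python
def real_time_factor(elapsed_seconds: float, num_samples: int, sample_rate: int) -> float:
    """Return elapsed processing time divided by the audio duration."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # length of the audio in seconds
    return elapsed_seconds / duration


# 2 seconds of 16 kHz audio generated in 0.5 seconds of wall-clock time.
print(real_time_factor(0.5, 32000, 16000))  # 0.25
```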

Dart API

You can use the following code to try vits-piper-ca_ES-upc_pau-x_low with the Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-ca_ES-upc_pau-x_low/ca_ES-upc_pau-x_low.onnx',
    tokens: 'vits-piper-ca_ES-upc_pau-x_low/tokens.txt',
    dataDir: 'vits-piper-ca_ES-upc_pau-x_low/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Si vols estar ben servit, fes-te tu mateix el llit', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to try vits-piper-ca_ES-upc_pau-x_low with the Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-ca_ES-upc_pau-x_low/ca_ES-upc_pau-x_low.onnx",
    lexicon: "",
    tokens: "vits-piper-ca_ES-upc_pau-x_low/tokens.txt",
    dataDir: "vits-piper-ca_ES-upc_pau-x_low/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Si vols estar ben servit, fes-te tu mateix el llit"
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to try vits-piper-ca_ES-upc_pau-x_low with the C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-ca_ES-upc_pau-x_low/ca_ES-upc_pau-x_low.onnx";
config.Model.Vits.Tokens = "vits-piper-ca_ES-upc_pau-x_low/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-ca_ES-upc_pau-x_low/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Si vols estar ben servit, fes-te tu mateix el llit";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to try vits-piper-ca_ES-upc_pau-x_low with the Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-ca_ES-upc_pau-x_low/ca_ES-upc_pau-x_low.onnx",
        tokens = "vits-piper-ca_ES-upc_pau-x_low/tokens.txt",
        dataDir = "vits-piper-ca_ES-upc_pau-x_low/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Si vols estar ben servit, fes-te tu mateix el llit",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to try vits-piper-ca_ES-upc_pau-x_low with the Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-ca_ES-upc_pau-x_low/ca_ES-upc_pau-x_low.onnx");
    vits.setTokens("vits-piper-ca_ES-upc_pau-x_low/tokens.txt");
    vits.setDataDir("vits-piper-ca_ES-upc_pau-x_low/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Si vols estar ben servit, fes-te tu mateix el llit";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to try vits-piper-ca_ES-upc_pau-x_low with the Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-ca_ES-upc_pau-x_low/ca_ES-upc_pau-x_low.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-ca_ES-upc_pau-x_low/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-ca_ES-upc_pau-x_low/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Si vols estar ben servit, fes-te tu mateix el llit', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to try vits-piper-ca_ES-upc_pau-x_low with the Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-ca_ES-upc_pau-x_low/ca_ES-upc_pau-x_low.onnx",
				Tokens:  "vits-piper-ca_ES-upc_pau-x_low/tokens.txt",
				DataDir: "vits-piper-ca_ES-upc_pau-x_low/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Si vols estar ben servit, fes-te tu mateix el llit"

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Si vols estar ben servit, fes-te tu mateix el llit

sample audio for each speaker is listed below:

Speaker 0

Chinese

This section lists text to speech models for Chinese.

vits-piper-zh_CN-chaowen-medium

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/zh/zh_CN/chaowen/medium

| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-zh_CN-chaowen-medium.tar.bz2

Android APK

The following table lists the Android TTS Engine APKs that bundle this model for sherpa-onnx v1.13.2.

| ABI | URL | China mirror |
|---|---|---|
| arm64-v8a | Download | Download |
| armeabi-v7a | Download | Download |
| x86_64 | Download | Download |
| x86 | Download | Download |

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to try vits-piper-zh_CN-chaowen-medium with the C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-zh_CN-chaowen-medium/zh_CN-chaowen-medium.onnx";
  config.model.vits.lexicon = "vits-piper-zh_CN-chaowen-medium/lexicon.txt";
  config.model.vits.tokens = "vits-piper-zh_CN-chaowen-medium/tokens.txt";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;
  config.rule_fsts = "vits-piper-zh_CN-chaowen-medium/phone.fst,vits-piper-zh_CN-chaowen-medium/date.fst,vits-piper-zh_CN-chaowen-medium/number.fst";

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。2024年12月31号,拨打110或者18920240511。123456块钱。当夜幕降临,星光点点,伴随着微风拂面,我在静谧中感受着时光的流转,思念如涟漪荡漾,梦境如画卷展开,我与自然融为一体,沉静在这片宁静的美丽之中,感受着生命的奇迹与温柔.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-zh_CN-chaowen-medium.tar.bz2

You can then use the following code to try vits-piper-zh_CN-chaowen-medium with the Python API:

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-zh_CN-chaowen-medium/zh_CN-chaowen-medium.onnx",
            lexicon="vits-piper-zh_CN-chaowen-medium/lexicon.txt",
            tokens="vits-piper-zh_CN-chaowen-medium/tokens.txt",
        ),
        num_threads=1,
    ),
    rule_fsts="vits-piper-zh_CN-chaowen-medium/phone.fst,vits-piper-zh_CN-chaowen-medium/date.fst,vits-piper-zh_CN-chaowen-medium/number.fst",
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。2024年12月31号,拨打110或者18920240511。123456块钱。当夜幕降临,星光点点,伴随着微风拂面,我在静谧中感受着时光的流转,思念如涟漪荡漾,梦境如画卷展开,我与自然融为一体,沉静在这片宁静的美丽之中,感受着生命的奇迹与温柔.",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
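The `rule_fsts` value in the example above is a single comma-separated string of FST paths. A small helper can build that string from the model directory and report a missing file before the TTS object is created. `build_rule_fsts` is a hypothetical name, not part of sherpa-onnx, and the default file names are the ones used in the example above.

```python
from pathlib import Path
from typing import Iterable


def build_rule_fsts(
    model_dir: str,
    names: Iterable[str] = ("phone.fst", "date.fst", "number.fst"),
) -> str:
    """Join rule FST paths under model_dir into one comma-separated string."""
    paths = [Path(model_dir) / name for name in names]
    missing = [str(p) for p in paths if not p.is_file()]
    if missing:
        raise FileNotFoundError("Missing rule FST files: " + ", ".join(missing))
    # as_posix() keeps forward slashes so the result matches the example above
    return ",".join(p.as_posix() for p in paths)
```

The result of `build_rule_fsts("vits-piper-zh_CN-chaowen-medium")` can then be passed as the `rule_fsts` argument in place of the hand-written string.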

C++ API

You can use the following code to try vits-piper-zh_CN-chaowen-medium with the C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-zh_CN-chaowen-medium/zh_CN-chaowen-medium.onnx";
  config.model.vits.lexicon = "vits-piper-zh_CN-chaowen-medium/lexicon.txt";
  config.model.vits.tokens = "vits-piper-zh_CN-chaowen-medium/tokens.txt";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;
  config.rule_fsts = "vits-piper-zh_CN-chaowen-medium/phone.fst,vits-piper-zh_CN-chaowen-medium/date.fst,vits-piper-zh_CN-chaowen-medium/number.fst";

  std::string filename = "./test.wav";
  std::string text = "某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。2024年12月31号,拨打110或者18920240511。123456块钱。当夜幕降临,星光点点,伴随着微风拂面,我在静谧中感受着时光的流转,思念如涟漪荡漾,梦境如画卷展开,我与自然融为一体,沉静在这片宁静的美丽之中,感受着生命的奇迹与温柔.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++   -std=c++17   -I /tmp/sherpa-onnx/shared/include   -L /tmp/sherpa-onnx/shared/lib   -o /tmp/test-piper   /tmp/test-piper.cc   -lsherpa-onnx-cxx-api   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.
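If you launch the compiled binary from a script, the same library search path can be set per process instead of exporting it in the shell. The following is a minimal Python sketch under that assumption; `library_path_env` is a hypothetical helper name, not part of sherpa-onnx.

```python
import os
import sys


def library_path_env(lib_dir, platform=sys.platform, base_env=None):
    """Return a copy of the environment with lib_dir prepended to the
    dynamic-loader search path for the given platform."""
    env = dict(os.environ if base_env is None else base_env)
    # macOS reads DYLD_LIBRARY_PATH; Linux reads LD_LIBRARY_PATH.
    var = "DYLD_LIBRARY_PATH" if platform == "darwin" else "LD_LIBRARY_PATH"
    current = env.get(var, "")
    env[var] = lib_dir if not current else lib_dir + os.pathsep + current
    return env


# Hypothetical usage (paths taken from the build steps above):
#   import subprocess
#   env = library_path_env("/tmp/sherpa-onnx/shared/lib")
#   subprocess.run(["/tmp/test-piper"], cwd="/tmp", env=env, check=True)
```

Passing the environment explicitly keeps the change local to the child process, so the rest of your shell session is unaffected.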

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to try vits-piper-zh_CN-chaowen-medium with the Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-zh_CN-chaowen-medium/zh_CN-chaowen-medium.onnx".into()),
                tokens: Some("vits-piper-zh_CN-chaowen-medium/tokens.txt".into()),
                lexicon: Some("vits-piper-zh_CN-chaowen-medium/lexicon.txt".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        rule_fsts: Some("vits-piper-zh_CN-chaowen-medium/phone.fst,vits-piper-zh_CN-chaowen-medium/date.fst,vits-piper-zh_CN-chaowen-medium/number.fst".into()),
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。2024年12月31号,拨打110或者18920240511。123456块钱。当夜幕降临,星光点点,伴随着微风拂面,我在静谧中感受着时光的流转,思念如涟漪荡漾,梦境如画卷展开,我与自然融为一体,沉静在这片宁静的美丽之中,感受着生命的奇迹与温柔.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API


You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-zh_CN-chaowen-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-zh_CN-chaowen-medium/zh_CN-chaowen-medium.onnx',
        tokens: 'vits-piper-zh_CN-chaowen-medium/tokens.txt',
        lexicon: 'vits-piper-zh_CN-chaowen-medium/lexicon.txt',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
    ruleFsts: 'vits-piper-zh_CN-chaowen-medium/phone.fst,vits-piper-zh_CN-chaowen-medium/date.fst,vits-piper-zh_CN-chaowen-medium/number.fst',
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = '某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。2024年12月31号,拨打110或者18920240511。123456块钱。当夜幕降临,星光点点,伴随着微风拂面,我在静谧中感受着时光的流转,思念如涟漪荡漾,梦境如画卷展开,我与自然融为一体,沉静在这片宁静的美丽之中,感受着生命的奇迹与温柔.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.
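The Node.js example above reports a real-time factor (RTF): elapsed synthesis time divided by the duration of the generated audio. The same arithmetic is sketched below in Python; `real_time_factor` and `timed` are hypothetical helper names used only for illustration.

```python
import time


def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """RTF = processing time / audio duration.

    A value below 1.0 means synthesis ran faster than real time.
    """
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate
    return elapsed_seconds / duration


def timed(fn, *args, **kwargs):
    """Call fn and return (result, elapsed_seconds) using a monotonic clock."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start
```

For example, 0.5 seconds spent generating 44100 samples at 22050 Hz (2 seconds of audio) gives an RTF of 0.25.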

Dart API

You can use the following code to try vits-piper-zh_CN-chaowen-medium with the Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-zh_CN-chaowen-medium/zh_CN-chaowen-medium.onnx',
    tokens: 'vits-piper-zh_CN-chaowen-medium/tokens.txt',
    lexicon: 'vits-piper-zh_CN-chaowen-medium/lexicon.txt',
    dataDir: '',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: '某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。2024年12月31号,拨打110或者18920240511。123456块钱。当夜幕降临,星光点点,伴随着微风拂面,我在静谧中感受着时光的流转,思念如涟漪荡漾,梦境如画卷展开,我与自然融为一体,沉静在这片宁静的美丽之中,感受着生命的奇迹与温柔.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to try vits-piper-zh_CN-chaowen-medium with the Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-zh_CN-chaowen-medium/zh_CN-chaowen-medium.onnx",
    lexicon: "vits-piper-zh_CN-chaowen-medium/lexicon.txt",
    tokens: "vits-piper-zh_CN-chaowen-medium/tokens.txt",
    dataDir: ""
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。2024年12月31号,拨打110或者18920240511。123456块钱。当夜幕降临,星光点点,伴随着微风拂面,我在静谧中感受着时光的流转,思念如涟漪荡漾,梦境如画卷展开,我与自然融为一体,沉静在这片宁静的美丽之中,感受着生命的奇迹与温柔."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to try vits-piper-zh_CN-chaowen-medium with the C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-zh_CN-chaowen-medium/zh_CN-chaowen-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-zh_CN-chaowen-medium/tokens.txt";
config.Model.Vits.Lexicon = "vits-piper-zh_CN-chaowen-medium/lexicon.txt";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.RuleFsts = "vits-piper-zh_CN-chaowen-medium/phone.fst,vits-piper-zh_CN-chaowen-medium/date.fst,vits-piper-zh_CN-chaowen-medium/number.fst";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。2024年12月31号,拨打110或者18920240511。123456块钱。当夜幕降临,星光点点,伴随着微风拂面,我在静谧中感受着时光的流转,思念如涟漪荡漾,梦境如画卷展开,我与自然融为一体,沉静在这片宁静的美丽之中,感受着生命的奇迹与温柔.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.
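In the examples above, the rule FSTs are passed as a single comma-separated string of paths inside the model directory. A small Python sketch of building that string follows; `build_rule_fsts` is a hypothetical helper, and the default file names (phone, date, number) are taken from the examples for this model.

```python
import posixpath


def build_rule_fsts(model_dir, names=("phone", "date", "number")):
    """Join <model_dir>/<name>.fst paths with commas, preserving the order given."""
    return ",".join(posixpath.join(model_dir, name + ".fst") for name in names)


# Hypothetical usage:
#   build_rule_fsts("vits-piper-zh_CN-chaowen-medium")
```

Building the string from one directory variable avoids repeating the model name three times and makes it harder to mix up files from different models.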

Kotlin API

You can use the following code to try vits-piper-zh_CN-chaowen-medium with the Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-zh_CN-chaowen-medium/zh_CN-chaowen-medium.onnx",
        tokens = "vits-piper-zh_CN-chaowen-medium/tokens.txt",
        lexicon = "vits-piper-zh_CN-chaowen-medium/lexicon.txt",
      ),
      numThreads = 1,
      debug = true,
    ),
    ruleFsts = "vits-piper-zh_CN-chaowen-medium/phone.fst,vits-piper-zh_CN-chaowen-medium/date.fst,vits-piper-zh_CN-chaowen-medium/number.fst",
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。2024年12月31号,拨打110或者18920240511。123456块钱。当夜幕降临,星光点点,伴随着微风拂面,我在静谧中感受着时光的流转,思念如涟漪荡漾,梦境如画卷展开,我与自然融为一体,沉静在这片宁静的美丽之中,感受着生命的奇迹与温柔.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.
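The callback in the example above follows a simple contract: return 1 to continue generating and 0 to stop. The sketch below illustrates that contract in Python with a callable that collects audio chunks and requests a stop once a sample budget is reached. `StopAfter` is a hypothetical class, and the optional `progress` argument is an assumption, since the callback signature differs between language bindings.

```python
class StopAfter:
    """Collect generated chunks; ask the generator to stop after max_samples."""

    def __init__(self, max_samples):
        self.max_samples = max_samples
        self.chunks = []
        self.total = 0

    def __call__(self, samples, progress=0.0):
        # Copy the chunk, since the caller may reuse its buffer.
        self.chunks.append(list(samples))
        self.total += len(samples)
        # 1 means continue, 0 means stop.
        return 1 if self.total < self.max_samples else 0
```

A callback like this is useful for streaming playback or for cutting off very long inputs early.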

Java API

You can use the following code to try vits-piper-zh_CN-chaowen-medium with the Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-zh_CN-chaowen-medium/zh_CN-chaowen-medium.onnx");
    vits.setTokens("vits-piper-zh_CN-chaowen-medium/tokens.txt");
    vits.setLexicon("vits-piper-zh_CN-chaowen-medium/lexicon.txt");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setRuleFsts("vits-piper-zh_CN-chaowen-medium/phone.fst,vits-piper-zh_CN-chaowen-medium/date.fst,vits-piper-zh_CN-chaowen-medium/number.fst");
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。2024年12月31号,拨打110或者18920240511。123456块钱。当夜幕降临,星光点点,伴随着微风拂面,我在静谧中感受着时光的流转,思念如涟漪荡漾,梦境如画卷展开,我与自然融为一体,沉静在这片宁静的美丽之中,感受着生命的奇迹与温柔.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to try vits-piper-zh_CN-chaowen-medium with the Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-zh_CN-chaowen-medium/zh_CN-chaowen-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-zh_CN-chaowen-medium/tokens.txt';
  Config.Model.Vits.Lexicon := 'vits-piper-zh_CN-chaowen-medium/lexicon.txt';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.RuleFsts := 'vits-piper-zh_CN-chaowen-medium/phone.fst,vits-piper-zh_CN-chaowen-medium/date.fst,vits-piper-zh_CN-chaowen-medium/number.fst';
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。2024年12月31号,拨打110或者18920240511。123456块钱。当夜幕降临,星光点点,伴随着微风拂面,我在静谧中感受着时光的流转,思念如涟漪荡漾,梦境如画卷展开,我与自然融为一体,沉静在这片宁静的美丽之中,感受着生命的奇迹与温柔.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to try vits-piper-zh_CN-chaowen-medium with the Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-zh_CN-chaowen-medium/zh_CN-chaowen-medium.onnx",
				Tokens:  "vits-piper-zh_CN-chaowen-medium/tokens.txt",
				Lexicon: "vits-piper-zh_CN-chaowen-medium/lexicon.txt",
				DataDir: "",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。2024年12月31号,拨打110或者18920240511。123456块钱。当夜幕降临,星光点点,伴随着微风拂面,我在静谧中感受着时光的流转,思念如涟漪荡漾,梦境如画卷展开,我与自然融为一体,沉静在这片宁静的美丽之中,感受着生命的奇迹与温柔."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。2024年12月31号,拨打110或者18920240511。123456块钱。当夜幕降临,星光点点,伴随着微风拂面,我在静谧中感受着时光的流转,思念如涟漪荡漾,梦境如画卷展开,我与自然融为一体,沉静在这片宁静的美丽之中,感受着生命的奇迹与温柔.

sample audio for each speaker is listed below:

Speaker 0

vits-piper-zh_CN-xiao_ya-medium

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/zh/zh_CN/xiao_ya/medium

| Number of speakers | Sample rate |
|--------------------|-------------|
| 1                  | 22050       |

Download the model


Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-zh_CN-xiao_ya-medium.tar.bz2
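If you prefer to fetch the archive from a script, the following Python sketch downloads and unpacks it using only the standard library. The helper names are hypothetical. It assumes the archive URL follows the pattern of the address above and that the archive extracts into a directory named after the model, as the paths in the examples on this page suggest.

```python
import os
import tarfile
import urllib.request

BASE_URL = "https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models"


def archive_url(model_name):
    """Build the release URL for a model archive (assumed naming pattern)."""
    return f"{BASE_URL}/{model_name}.tar.bz2"


def download_and_extract(model_name, dest_dir="."):
    """Download the archive, unpack it into dest_dir, then delete the archive."""
    archive = os.path.join(dest_dir, model_name + ".tar.bz2")
    urllib.request.urlretrieve(archive_url(model_name), archive)
    with tarfile.open(archive, "r:bz2") as tar:
        tar.extractall(dest_dir)
    os.remove(archive)
    # Assumed layout: the archive contains one top-level directory
    # with the same name as the model.
    return os.path.join(dest_dir, model_name)


# Hypothetical usage:
#   model_dir = download_and_extract("vits-piper-zh_CN-xiao_ya-medium")
```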

Android APK

The following table lists the Android TTS Engine APKs that bundle this model, built for sherpa-onnx v1.13.2.

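| ABI         | URL      | China mirror |
|-------------|----------|--------------|
| arm64-v8a   | Download | Download     |
| armeabi-v7a | Download | Download     |
| x86_64      | Download | Download     |
| x86         | Download | Download     |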

If you don’t know what ABI is, you probably need to select arm64-v8a.
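If you want to check rather than guess, the ABI can usually be inferred from the machine architecture string a device reports (for example via `uname -m`). The Python sketch below shows one such mapping. The table and the `guess_abi` helper are illustrative assumptions, not part of sherpa-onnx, and the fallback mirrors the advice above.

```python
# Assumed mapping from common machine architecture strings to Android ABI names.
MACHINE_TO_ABI = {
    "aarch64": "arm64-v8a",
    "arm64": "arm64-v8a",
    "armv7l": "armeabi-v7a",
    "x86_64": "x86_64",
    "i686": "x86",
    "i386": "x86",
}


def guess_abi(machine, default="arm64-v8a"):
    """Map an architecture string to an Android ABI, falling back to default."""
    return MACHINE_TO_ABI.get(machine.strip().lower(), default)
```

On a device connected over adb, `adb shell getprop ro.product.cpu.abi` normally prints the ABI name directly, which avoids the need for a mapping.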

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to try vits-piper-zh_CN-xiao_ya-medium with the C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-zh_CN-xiao_ya-medium/zh_CN-xiao_ya-medium.onnx";
  config.model.vits.lexicon = "vits-piper-zh_CN-xiao_ya-medium/lexicon.txt";
  config.model.vits.tokens = "vits-piper-zh_CN-xiao_ya-medium/tokens.txt";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;
  config.rule_fsts = "vits-piper-zh_CN-xiao_ya-medium/phone.fst,vits-piper-zh_CN-xiao_ya-medium/date.fst,vits-piper-zh_CN-xiao_ya-medium/number.fst";

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。2024年12月31号,拨打110或者18920240511。123456块钱。当夜幕降临,星光点点,伴随着微风拂面,我在静谧中感受着时光的流转,思念如涟漪荡漾,梦境如画卷展开,我与自然融为一体,沉静在这片宁静的美丽之中,感受着生命的奇迹与温柔.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc   -I /tmp/sherpa-onnx/shared/include   -L /tmp/sherpa-onnx/shared/lib   -o /tmp/test-piper   /tmp/test-piper.c   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API


Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-zh_CN-xiao_ya-medium.tar.bz2

You can use the following code to play with vits-piper-zh_CN-xiao_ya-medium

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-zh_CN-xiao_ya-medium/zh_CN-xiao_ya-medium.onnx",
            lexicon="vits-piper-zh_CN-xiao_ya-medium/lexicon.txt",
            tokens="vits-piper-zh_CN-xiao_ya-medium/tokens.txt",
        ),
        num_threads=1,
    ),
    rule_fsts="vits-piper-zh_CN-xiao_ya-medium/phone.fst,vits-piper-zh_CN-xiao_ya-medium/date.fst,vits-piper-zh_CN-xiao_ya-medium/number.fst",
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。2024年12月31号,拨打110或者18920240511。123456块钱。当夜幕降临,星光点点,伴随着微风拂面,我在静谧中感受着时光的流转,思念如涟漪荡漾,梦境如画卷展开,我与自然融为一体,沉静在这片宁静的美丽之中,感受着生命的奇迹与温柔.",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
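To sanity-check a generated WAV file, such as the test.wav written by the other examples in this section, you can read its header with Python's standard `wave` module. This is a minimal sketch that assumes the file is PCM-encoded; `wav_info` is a hypothetical helper name.

```python
import wave


def wav_info(path):
    """Return (sample_rate, num_channels, duration_seconds) for a PCM WAV file."""
    with wave.open(path, "rb") as f:
        rate = f.getframerate()
        return rate, f.getnchannels(), f.getnframes() / rate


# Hypothetical usage:
#   rate, channels, seconds = wav_info("test.wav")
```

For this model, the reported sample rate should match the value listed in the model's info table.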

C++ API

You can use the following code to try vits-piper-zh_CN-xiao_ya-medium with the C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-zh_CN-xiao_ya-medium/zh_CN-xiao_ya-medium.onnx";
  config.model.vits.lexicon = "vits-piper-zh_CN-xiao_ya-medium/lexicon.txt";
  config.model.vits.tokens = "vits-piper-zh_CN-xiao_ya-medium/tokens.txt";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;
  config.rule_fsts = "vits-piper-zh_CN-xiao_ya-medium/phone.fst,vits-piper-zh_CN-xiao_ya-medium/date.fst,vits-piper-zh_CN-xiao_ya-medium/number.fst";

  std::string filename = "./test.wav";
  std::string text = "某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。2024年12月31号,拨打110或者18920240511。123456块钱。当夜幕降临,星光点点,伴随着微风拂面,我在静谧中感受着时光的流转,思念如涟漪荡漾,梦境如画卷展开,我与自然融为一体,沉静在这片宁静的美丽之中,感受着生命的奇迹与温柔.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++   -std=c++17   -I /tmp/sherpa-onnx/shared/include   -L /tmp/sherpa-onnx/shared/lib   -o /tmp/test-piper   /tmp/test-piper.cc   -lsherpa-onnx-cxx-api   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to try vits-piper-zh_CN-xiao_ya-medium with the Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-zh_CN-xiao_ya-medium/zh_CN-xiao_ya-medium.onnx".into()),
                tokens: Some("vits-piper-zh_CN-xiao_ya-medium/tokens.txt".into()),
                lexicon: Some("vits-piper-zh_CN-xiao_ya-medium/lexicon.txt".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        rule_fsts: Some("vits-piper-zh_CN-xiao_ya-medium/phone.fst,vits-piper-zh_CN-xiao_ya-medium/date.fst,vits-piper-zh_CN-xiao_ya-medium/number.fst".into()),
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。2024年12月31号,拨打110或者18920240511。123456块钱。当夜幕降临,星光点点,伴随着微风拂面,我在静谧中感受着时光的流转,思念如涟漪荡漾,梦境如画卷展开,我与自然融为一体,沉静在这片宁静的美丽之中,感受着生命的奇迹与温柔.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API


You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-zh_CN-xiao_ya-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-zh_CN-xiao_ya-medium/zh_CN-xiao_ya-medium.onnx',
        tokens: 'vits-piper-zh_CN-xiao_ya-medium/tokens.txt',
        lexicon: 'vits-piper-zh_CN-xiao_ya-medium/lexicon.txt',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
    ruleFsts: 'vits-piper-zh_CN-xiao_ya-medium/phone.fst,vits-piper-zh_CN-xiao_ya-medium/date.fst,vits-piper-zh_CN-xiao_ya-medium/number.fst',
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = '某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。2024年12月31号,拨打110或者18920240511。123456块钱。当夜幕降临,星光点点,伴随着微风拂面,我在静谧中感受着时光的流转,思念如涟漪荡漾,梦境如画卷展开,我与自然融为一体,沉静在这片宁静的美丽之中,感受着生命的奇迹与温柔.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to try vits-piper-zh_CN-xiao_ya-medium with the Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-zh_CN-xiao_ya-medium/zh_CN-xiao_ya-medium.onnx',
    tokens: 'vits-piper-zh_CN-xiao_ya-medium/tokens.txt',
    lexicon: 'vits-piper-zh_CN-xiao_ya-medium/lexicon.txt',
    dataDir: '',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: '某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。2024年12月31号,拨打110或者18920240511。123456块钱。当夜幕降临,星光点点,伴随着微风拂面,我在静谧中感受着时光的流转,思念如涟漪荡漾,梦境如画卷展开,我与自然融为一体,沉静在这片宁静的美丽之中,感受着生命的奇迹与温柔.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to try vits-piper-zh_CN-xiao_ya-medium with the Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-zh_CN-xiao_ya-medium/zh_CN-xiao_ya-medium.onnx",
    lexicon: "vits-piper-zh_CN-xiao_ya-medium/lexicon.txt",
    tokens: "vits-piper-zh_CN-xiao_ya-medium/tokens.txt",
    dataDir: ""
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。2024年12月31号,拨打110或者18920240511。123456块钱。当夜幕降临,星光点点,伴随着微风拂面,我在静谧中感受着时光的流转,思念如涟漪荡漾,梦境如画卷展开,我与自然融为一体,沉静在这片宁静的美丽之中,感受着生命的奇迹与温柔."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to try vits-piper-zh_CN-xiao_ya-medium with the C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-zh_CN-xiao_ya-medium/zh_CN-xiao_ya-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-zh_CN-xiao_ya-medium/tokens.txt";
config.Model.Vits.Lexicon = "vits-piper-zh_CN-xiao_ya-medium/lexicon.txt";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.RuleFsts = "vits-piper-zh_CN-xiao_ya-medium/phone.fst,vits-piper-zh_CN-xiao_ya-medium/date.fst,vits-piper-zh_CN-xiao_ya-medium/number.fst";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。2024年12月31号,拨打110或者18920240511。123456块钱。当夜幕降临,星光点点,伴随着微风拂面,我在静谧中感受着时光的流转,思念如涟漪荡漾,梦境如画卷展开,我与自然融为一体,沉静在这片宁静的美丽之中,感受着生命的奇迹与温柔.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to try vits-piper-zh_CN-xiao_ya-medium with the Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-zh_CN-xiao_ya-medium/zh_CN-xiao_ya-medium.onnx",
        tokens = "vits-piper-zh_CN-xiao_ya-medium/tokens.txt",
        lexicon = "vits-piper-zh_CN-xiao_ya-medium/lexicon.txt",
      ),
      numThreads = 1,
      debug = true,
    ),
    ruleFsts = "vits-piper-zh_CN-xiao_ya-medium/phone.fst,vits-piper-zh_CN-xiao_ya-medium/date.fst,vits-piper-zh_CN-xiao_ya-medium/number.fst",
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。2024年12月31号,拨打110或者18920240511。123456块钱。当夜幕降临,星光点点,伴随着微风拂面,我在静谧中感受着时光的流转,思念如涟漪荡漾,梦境如画卷展开,我与自然融为一体,沉静在这片宁静的美丽之中,感受着生命的奇迹与温柔.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to try vits-piper-zh_CN-xiao_ya-medium with the Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-zh_CN-xiao_ya-medium/zh_CN-xiao_ya-medium.onnx");
    vits.setTokens("vits-piper-zh_CN-xiao_ya-medium/tokens.txt");
    vits.setLexicon("vits-piper-zh_CN-xiao_ya-medium/lexicon.txt");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setRuleFsts("vits-piper-zh_CN-xiao_ya-medium/phone.fst,vits-piper-zh_CN-xiao_ya-medium/date.fst,vits-piper-zh_CN-xiao_ya-medium/number.fst");
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。2024年12月31号,拨打110或者18920240511。123456块钱。当夜幕降临,星光点点,伴随着微风拂面,我在静谧中感受着时光的流转,思念如涟漪荡漾,梦境如画卷展开,我与自然融为一体,沉静在这片宁静的美丽之中,感受着生命的奇迹与温柔.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-zh_CN-xiao_ya-medium with Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-zh_CN-xiao_ya-medium/zh_CN-xiao_ya-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-zh_CN-xiao_ya-medium/tokens.txt';
  Config.Model.Vits.Lexicon := 'vits-piper-zh_CN-xiao_ya-medium/lexicon.txt';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.RuleFsts := 'vits-piper-zh_CN-xiao_ya-medium/phone.fst,vits-piper-zh_CN-xiao_ya-medium/date.fst,vits-piper-zh_CN-xiao_ya-medium/number.fst';
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。2024年12月31号,拨打110或者18920240511。123456块钱。当夜幕降临,星光点点,伴随着微风拂面,我在静谧中感受着时光的流转,思念如涟漪荡漾,梦境如画卷展开,我与自然融为一体,沉静在这片宁静的美丽之中,感受着生命的奇迹与温柔.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-zh_CN-xiao_ya-medium with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-zh_CN-xiao_ya-medium/zh_CN-xiao_ya-medium.onnx",
				Tokens:  "vits-piper-zh_CN-xiao_ya-medium/tokens.txt",
				DataDir: "",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。2024年12月31号,拨打110或者18920240511。123456块钱。当夜幕降临,星光点点,伴随着微风拂面,我在静谧中感受着时光的流转,思念如涟漪荡漾,梦境如画卷展开,我与自然融为一体,沉静在这片宁静的美丽之中,感受着生命的奇迹与温柔."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。2024年12月31号,拨打110或者18920240511。123456块钱。当夜幕降临,星光点点,伴随着微风拂面,我在静谧中感受着时光的流转,思念如涟漪荡漾,梦境如画卷展开,我与自然融为一体,沉静在这片宁静的美丽之中,感受着生命的奇迹与温柔.

sample audio for each speaker is listed below:

Speaker 0

matcha-icefall-zh-baker

Info about this model

This model is trained using the code from https://github.com/k2-fsa/icefall/tree/master/egs/baker_zh/TTS/matcha

It supports only Chinese.

Number of speakers | Sample rate
1 | 22050

Download the model

You need to download the acoustic model and the vocoder model.

Download the acoustic model

Please use the following code to download the model:

wget https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/matcha-icefall-zh-baker.tar.bz2

tar xvf matcha-icefall-zh-baker.tar.bz2
rm matcha-icefall-zh-baker.tar.bz2

You should see the following output:

ls -lh matcha-icefall-zh-baker/
total 150848
-rw-r--r--@  1 fangjun  staff    58K  6 Oct 08:39 date.fst
drwxr-xr-x@ 10 fangjun  staff   320B 18 Feb  2025 dict
-rw-r--r--@  1 fangjun  staff   1.3M  6 Oct 08:39 lexicon.txt
-rw-r--r--@  1 fangjun  staff    72M  6 Oct 08:39 model-steps-3.onnx
-rw-r--r--@  1 fangjun  staff    63K  6 Oct 08:39 number.fst
-rw-r--r--@  1 fangjun  staff    87K  6 Oct 08:39 phone.fst
-rw-r--r--@  1 fangjun  staff   370B  6 Oct 08:39 README.md
-rw-r--r--@  1 fangjun  staff    19K  6 Oct 08:39 tokens.txt

Note: The dict directory is no longer needed for this model.
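If you prefer to script the extraction step instead of calling tar, the following is a minimal sketch that uses only the Python standard library. The helper name `extract_model_archive` is hypothetical, and the sketch assumes the archive has already been downloaded (for example with the wget command above).

```python
import tarfile
from pathlib import Path


def extract_model_archive(archive: str, dest: str = ".") -> list[str]:
    """Extract a .tar.bz2 archive into dest and return the member names.

    Hypothetical helper for illustration; it is not part of sherpa-onnx.
    """
    dest_path = Path(dest)
    dest_path.mkdir(parents=True, exist_ok=True)
    # Use the safer "data" extraction filter when this Python version has it.
    kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
    with tarfile.open(archive, mode="r:bz2") as tar:
        names = tar.getnames()
        tar.extractall(dest_path, **kwargs)
    return names
```

Calling `extract_model_archive("matcha-icefall-zh-baker.tar.bz2")` in the download directory should leave the same files on disk as the tar command shown above.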

Download the vocoder model

wget https://github.com/k2-fsa/sherpa-onnx/releases/download/vocoder-models/vocos-22khz-univ.onnx

You should see the following output

ls -lh vocos-22khz-univ.onnx

-rw-r--r--@ 1 fangjun  staff    51M 17 Mar  2025 vocos-22khz-univ.onnx
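Before running any of the examples below, it can help to confirm that both downloads are in place. The following is a small sketch, not part of sherpa-onnx: the helper name `find_missing_files` is hypothetical, and the file list is taken from the directory listing above.

```python
from pathlib import Path

# File names copied from the listing above. The vocoder is a separate download.
REQUIRED_MODEL_FILES = (
    "model-steps-3.onnx",
    "lexicon.txt",
    "tokens.txt",
    "phone.fst",
    "date.fst",
    "number.fst",
)


def find_missing_files(model_dir: str, vocoder: str) -> list[str]:
    """Return the expected paths that are not present on disk."""
    expected = [Path(model_dir) / name for name in REQUIRED_MODEL_FILES]
    expected.append(Path(vocoder))
    return [str(p) for p in expected if not p.is_file()]
```

For example, `find_missing_files("matcha-icefall-zh-baker", "vocos-22khz-univ.onnx")` returns an empty list when everything has been downloaded into the current directory.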

Huggingface space

You can try this model by visiting https://huggingface.co/spaces/k2-fsa/text-to-speech

Huggingface space (WebAssembly, wasm)

You can try this model by visiting

https://huggingface.co/spaces/k2-fsa/web-assembly-zh-tts-matcha

The source code is available at https://github.com/k2-fsa/sherpa-onnx/tree/master/wasm/tts

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v

ABI | URL | China mirror
arm64-v8a | Download | Download
armeabi-v7a | Download | Download
x86_64 | Download | Download
x86 | Download | Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

Python API

The following code shows how to use the Python API of sherpa-onnx with this model.

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        matcha=sherpa_onnx.OfflineTtsMatchaModelConfig(
            acoustic_model="matcha-icefall-zh-baker/model-steps-3.onnx",
            vocoder="vocos-22khz-univ.onnx",
            lexicon="matcha-icefall-zh-baker/lexicon.txt",
            tokens="matcha-icefall-zh-baker/tokens.txt",
        ),
        num_threads=2,
        debug=True, # set it False to disable debug output
    ),
    max_num_sentences=1,
    rule_fsts="matcha-icefall-zh-baker/phone.fst,matcha-icefall-zh-baker/date.fst,matcha-icefall-zh-baker/number.fst",
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
text = "某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。2024年12月31号,拨打110或者18920240511。123456块钱。当夜幕降临,星光点点,伴随着微风拂面,我在静谧中感受着时光的流转,思念如涟漪荡漾,梦境如画卷展开,我与自然融为一体,沉静在这片宁静的美丽之中,感受着生命的奇迹与温柔."


audio = tts.generate(text, sid=0, speed=1.0)

sf.write(
    "./test.mp3",
    audio.samples,
    samplerate=audio.sample_rate,
)

You can save it as test-zh.py and then run:

pip install sherpa-onnx soundfile

python3 ./test-zh.py

You will get a file test.mp3 in the end.
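In the example above, rule_fsts is a single comma-separated string of file paths. If you keep the model in a different directory, a helper like the one below can build that string for you. This is only a sketch: `join_rule_fsts` is a hypothetical name, and the helper simply preserves the order used on this page.

```python
from pathlib import PurePosixPath


def join_rule_fsts(model_dir: str,
                   names=("phone.fst", "date.fst", "number.fst")) -> str:
    """Build the comma-separated rule_fsts value, keeping the given order."""
    return ",".join(str(PurePosixPath(model_dir) / name) for name in names)


rule_fsts = join_rule_fsts("matcha-icefall-zh-baker")
print(rule_fsts)
```

The resulting string can be passed as the rule_fsts argument in place of the hard-coded value.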

C API

You can use the following code to play with matcha-icefall-zh-baker using C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.matcha.acoustic_model = "matcha-icefall-zh-baker/model-steps-3.onnx";
  config.model.matcha.vocoder = "vocos-22khz-univ.onnx";
  config.model.matcha.lexicon = "matcha-icefall-zh-baker/lexicon.txt";
  config.model.matcha.tokens = "matcha-icefall-zh-baker/tokens.txt";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;
  config.rule_fsts = "matcha-icefall-zh-baker/phone.fst,matcha-icefall-zh-baker/date.fst,matcha-icefall-zh-baker/number.fst";

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。2024年12月31号,拨打110或者18920240511。123456块钱。当夜幕降临,星光点点,伴随着微风拂面,我在静谧中感受着时光的流转,思念如涟漪荡漾,梦境如画卷展开,我与自然融为一体,沉静在这片宁静的美丽之中,感受着生命的奇迹与温柔.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
 -DSHERPA_ONNX_ENABLE_C_API=ON \
 -DCMAKE_BUILD_TYPE=Release \
 -DBUILD_SHARED_LIBS=ON \
 -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
 ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-zh.c. Then you can compile it with the following command:

gcc \
  -I /tmp/sherpa-onnx/shared/include \
  -L /tmp/sherpa-onnx/shared/lib \
  -o /tmp/test-zh \
  /tmp/test-zh.c \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the acoustic model as well as the vocoder model and put them in /tmp
./test-zh

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-zh.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html
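If you launch the compiled program from a script, you can set the library search path for that one process instead of exporting it in your shell. The sketch below assumes the install prefix used above; `runtime_env` is a hypothetical helper name.

```python
import os
import sys


def runtime_env(lib_dir: str, platform: str = sys.platform, base=None) -> dict:
    """Return a copy of the environment with lib_dir prepended to the loader path.

    DYLD_LIBRARY_PATH is used on macOS and LD_LIBRARY_PATH elsewhere,
    mirroring the two export commands shown above.
    """
    env = dict(os.environ if base is None else base)
    var = "DYLD_LIBRARY_PATH" if platform == "darwin" else "LD_LIBRARY_PATH"
    old = env.get(var, "")
    env[var] = f"{lib_dir}:{old}" if old else lib_dir
    return env
```

You could then run the program with `subprocess.run(["/tmp/test-zh"], cwd="/tmp", env=runtime_env("/tmp/sherpa-onnx/shared/lib"), check=True)`.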

C++ API

You can use the following code to play with matcha-icefall-zh-baker using C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.matcha.acoustic_model = "matcha-icefall-zh-baker/model-steps-3.onnx";
  config.model.matcha.vocoder = "vocos-22khz-univ.onnx";
  config.model.matcha.lexicon = "matcha-icefall-zh-baker/lexicon.txt";
  config.model.matcha.tokens = "matcha-icefall-zh-baker/tokens.txt";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;
  config.rule_fsts = "matcha-icefall-zh-baker/phone.fst,matcha-icefall-zh-baker/date.fst,matcha-icefall-zh-baker/number.fst";

  std::string filename = "./test.wav";
  std::string text = "某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。2024年12月31号,拨打110或者18920240511。123456块钱。当夜幕降临,星光点点,伴随着微风拂面,我在静谧中感受着时光的流转,思念如涟漪荡漾,梦境如画卷展开,我与自然融为一体,沉静在这片宁静的美丽之中,感受着生命的奇迹与温柔.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
 -DSHERPA_ONNX_ENABLE_C_API=ON \
 -DCMAKE_BUILD_TYPE=Release \
 -DBUILD_SHARED_LIBS=ON \
 -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
 ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-zh.cc. Then you can compile it with the following command:

g++ \
  -std=c++17 \
  -I /tmp/sherpa-onnx/shared/include \
  -L /tmp/sherpa-onnx/shared/lib \
  -o /tmp/test-zh \
  /tmp/test-zh.cc \
  -lsherpa-onnx-cxx-api \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the acoustic model as well as the vocoder model and put them in /tmp
./test-zh

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-zh.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with matcha-icefall-zh-baker with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsMatchaModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            matcha: OfflineTtsMatchaModelConfig {
                acoustic_model: Some("matcha-icefall-zh-baker/model-steps-3.onnx".into()),
                vocoder: Some("vocos-22khz-univ.onnx".into()),
                tokens: Some("matcha-icefall-zh-baker/tokens.txt".into()),
                lexicon: Some("matcha-icefall-zh-baker/lexicon.txt".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        rule_fsts: Some("matcha-icefall-zh-baker/phone.fst,matcha-icefall-zh-baker/date.fst,matcha-icefall-zh-baker/number.fst".into()),
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。2024年12月31号,拨打110或者18920240511。123456块钱。当夜幕降临,星光点点,伴随着微风拂面,我在静谧中感受着时光的流转,思念如涟漪荡漾,梦境如画卷展开,我与自然融为一体,沉静在这片宁静的美丽之中,感受着生命的奇迹与温柔.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with matcha-icefall-zh-baker with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      matcha: {
        acousticModel: 'matcha-icefall-zh-baker/model-steps-3.onnx',
        vocoder: 'vocos-22khz-univ.onnx',
        tokens: 'matcha-icefall-zh-baker/tokens.txt',
        lexicon: 'matcha-icefall-zh-baker/lexicon.txt',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
    ruleFsts: 'matcha-icefall-zh-baker/phone.fst,matcha-icefall-zh-baker/date.fst,matcha-icefall-zh-baker/number.fst',
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = '某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。2024年12月31号,拨打110或者18920240511。123456块钱。当夜幕降临,星光点点,伴随着微风拂面,我在静谧中感受着时光的流转,思念如涟漪荡漾,梦境如画卷展开,我与自然融为一体,沉静在这片宁静的美丽之中,感受着生命的奇迹与温柔.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.
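The Node.js example above reports a real-time factor (RTF), that is, the time spent generating divided by the duration of the generated audio. The same arithmetic is sketched below in Python with made-up numbers. `real_time_factor` is a hypothetical helper and is not part of sherpa-onnx.

```python
def real_time_factor(num_samples: int, sample_rate: int,
                     elapsed_seconds: float) -> float:
    """RTF = processing time / audio duration.

    A value below 1 means generation was faster than real time.
    """
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate
    return elapsed_seconds / duration


# Made-up numbers: 5 seconds of 22050 Hz audio generated in 1 second.
print(f"RTF = {real_time_factor(5 * 22050, 22050, 1.0):.3f}")  # prints "RTF = 0.200"
```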

Dart API

You can use the following code to play with matcha-icefall-zh-baker with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final matcha = sherpa_onnx.OfflineTtsMatchaModelConfig(
    acousticModel: 'matcha-icefall-zh-baker/model-steps-3.onnx',
    vocoder: 'vocos-22khz-univ.onnx',
    tokens: 'matcha-icefall-zh-baker/tokens.txt',
    lexicon: 'matcha-icefall-zh-baker/lexicon.txt',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    matcha: matcha,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: '某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。2024年12月31号,拨打110或者18920240511。123456块钱。当夜幕降临,星光点点,伴随着微风拂面,我在静谧中感受着时光的流转,思念如涟漪荡漾,梦境如画卷展开,我与自然融为一体,沉静在这片宁静的美丽之中,感受着生命的奇迹与温柔.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with matcha-icefall-zh-baker with Swift API.

func run() {
  let matcha = sherpaOnnxOfflineTtsMatchaModelConfig(
    acousticModel: "matcha-icefall-zh-baker/model-steps-3.onnx",
    vocoder: "vocos-22khz-univ.onnx",
    tokens: "matcha-icefall-zh-baker/tokens.txt",
    dataDir: "",
    lexicon: "matcha-icefall-zh-baker/lexicon.txt"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(matcha: matcha)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。2024年12月31号,拨打110或者18920240511。123456块钱。当夜幕降临,星光点点,伴随着微风拂面,我在静谧中感受着时光的流转,思念如涟漪荡漾,梦境如画卷展开,我与自然融为一体,沉静在这片宁静的美丽之中,感受着生命的奇迹与温柔."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with matcha-icefall-zh-baker with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Matcha.AcousticModel = "matcha-icefall-zh-baker/model-steps-3.onnx";
config.Model.Matcha.Vocoder = "vocos-22khz-univ.onnx";
config.Model.Matcha.Tokens = "matcha-icefall-zh-baker/tokens.txt";
config.Model.Matcha.Lexicon = "matcha-icefall-zh-baker/lexicon.txt";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.RuleFsts = "matcha-icefall-zh-baker/phone.fst,matcha-icefall-zh-baker/date.fst,matcha-icefall-zh-baker/number.fst";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。2024年12月31号,拨打110或者18920240511。123456块钱。当夜幕降临,星光点点,伴随着微风拂面,我在静谧中感受着时光的流转,思念如涟漪荡漾,梦境如画卷展开,我与自然融为一体,沉静在这片宁静的美丽之中,感受着生命的奇迹与温柔.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with matcha-icefall-zh-baker with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      matcha = OfflineTtsMatchaModelConfig(
        acousticModel = "matcha-icefall-zh-baker/model-steps-3.onnx",
        vocoder = "vocos-22khz-univ.onnx",
        tokens = "matcha-icefall-zh-baker/tokens.txt",
        lexicon = "matcha-icefall-zh-baker/lexicon.txt",
      ),
      numThreads = 1,
      debug = true,
    ),
    ruleFsts = "matcha-icefall-zh-baker/phone.fst,matcha-icefall-zh-baker/date.fst,matcha-icefall-zh-baker/number.fst",
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。2024年12月31号,拨打110或者18920240511。123456块钱。当夜幕降临,星光点点,伴随着微风拂面,我在静谧中感受着时光的流转,思念如涟漪荡漾,梦境如画卷展开,我与自然融为一体,沉静在这片宁静的美丽之中,感受着生命的奇迹与温柔.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.
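The callback in the example above always returns 1, so generation never stops early. To illustrate the convention (1 means continue, 0 means stop), here is a language-neutral sketch in Python of a callback that asks to stop once a given amount of audio has been received. The factory name `make_limit_callback` and the optional `progress` parameter are assumptions for illustration; check the API documentation of the binding you use for its exact callback signature.

```python
def make_limit_callback(sample_rate: int, max_seconds: float):
    """Return a callback that requests a stop after max_seconds of audio."""
    received = 0  # number of samples seen so far

    def callback(samples, progress=None):
        nonlocal received
        received += len(samples)
        # 1 means to continue, 0 means to stop
        return 1 if received < sample_rate * max_seconds else 0

    return callback
```

With `make_limit_callback(22050, 3.0)`, the callback returns 0 as soon as at least 3 seconds of 22050 Hz samples have been delivered.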

Java API

You can use the following code to play with matcha-icefall-zh-baker with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var matcha = new OfflineTtsMatchaModelConfig();
    matcha.setAcousticModel("matcha-icefall-zh-baker/model-steps-3.onnx");
    matcha.setVocoder("vocos-22khz-univ.onnx");
    matcha.setTokens("matcha-icefall-zh-baker/tokens.txt");
    matcha.setLexicon("matcha-icefall-zh-baker/lexicon.txt");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setMatcha(matcha);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setRuleFsts("matcha-icefall-zh-baker/phone.fst,matcha-icefall-zh-baker/date.fst,matcha-icefall-zh-baker/number.fst");
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。2024年12月31号,拨打110或者18920240511。123456块钱。当夜幕降临,星光点点,伴随着微风拂面,我在静谧中感受着时光的流转,思念如涟漪荡漾,梦境如画卷展开,我与自然融为一体,沉静在这片宁静的美丽之中,感受着生命的奇迹与温柔.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with matcha-icefall-zh-baker with Pascal API.

program test_matcha;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Matcha.AcousticModel := 'matcha-icefall-zh-baker/model-steps-3.onnx';
  Config.Model.Matcha.Vocoder := 'vocos-22khz-univ.onnx';
  Config.Model.Matcha.Tokens := 'matcha-icefall-zh-baker/tokens.txt';
  Config.Model.Matcha.Lexicon := 'matcha-icefall-zh-baker/lexicon.txt';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.RuleFsts := 'matcha-icefall-zh-baker/phone.fst,matcha-icefall-zh-baker/date.fst,matcha-icefall-zh-baker/number.fst';
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。2024年12月31号,拨打110或者18920240511。123456块钱。当夜幕降临,星光点点,伴随着微风拂面,我在静谧中感受着时光的流转,思念如涟漪荡漾,梦境如画卷展开,我与自然融为一体,沉静在这片宁静的美丽之中,感受着生命的奇迹与温柔.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with matcha-icefall-zh-baker with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Matcha: sherpa.OfflineTtsMatchaModelConfig{
				AcousticModel: "matcha-icefall-zh-baker/model-steps-3.onnx",
				Vocoder:       "vocos-22khz-univ.onnx",
				Tokens:        "matcha-icefall-zh-baker/tokens.txt",
				Lexicon:       "matcha-icefall-zh-baker/lexicon.txt",
			},
			NumThreads: 1,
			Debug:      true,
		},
		RuleFsts: "matcha-icefall-zh-baker/phone.fst,matcha-icefall-zh-baker/date.fst,matcha-icefall-zh-baker/number.fst",
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。2024年12月31号,拨打110或者18920240511。123456块钱。当夜幕降临,星光点点,伴随着微风拂面,我在静谧中感受着时光的流转,思念如涟漪荡漾,梦境如画卷展开,我与自然融为一体,沉静在这片宁静的美丽之中,感受着生命的奇迹与温柔."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.
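Whichever API you use, you can sanity-check the generated file afterwards. The sketch below reads the WAV header with the Python standard library and assumes the output is a PCM WAV file; `wav_info` is a hypothetical helper name.

```python
import wave


def wav_info(path: str) -> tuple[int, int, float]:
    """Return (sample_rate, num_frames, duration_seconds) of a PCM WAV file."""
    with wave.open(path, "rb") as f:
        rate = f.getframerate()
        frames = f.getnframes()
    return rate, frames, frames / rate
```

For this model, `wav_info("./test.wav")` would be expected to report a sample rate of 22050, matching the table at the start of this section.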

Samples

For the following text:

某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。2024年12月31号,拨打110或者18920240511。123456块钱。当夜幕降临,星光点点,伴随着微风拂面,我在静谧中感受着时光的流转,思念如涟漪荡漾,梦境如画卷展开,我与自然融为一体,沉静在这片宁静的美丽之中,感受着生命的奇迹与温柔.

sample audio for each speaker is listed below:

Speaker 0

Croatian

This section lists text to speech models for Croatian.

supertonic-3-hr

Info about this model

This model is supertonic 3 from https://huggingface.co/Supertone/supertonic-3

It supports 31 languages: en, ko, ja, ar, bg, cs, da, de, el, es, et, fi, fr, hi, hr, hu, id, it, lt, lv, nl, pl, pt, ro, ru, sk, sl, sv, tr, uk, vi.

This page shows samples for Croatian (hr).

Number of speakers | Sample rate
10 | 24000

Speaker IDs

sid | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
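Since the model accepts a fixed set of language codes and speaker IDs, you may want to validate user input before generating audio. The following is a minimal sketch based only on the values listed above; `check_request` is a hypothetical helper and is not part of sherpa-onnx.

```python
# Language codes copied from the list above (31 in total).
SUPPORTED_LANGS = frozenset(
    "en ko ja ar bg cs da de el es et fi fr hi hr hu id it lt lv "
    "nl pl pt ro ru sk sl sv tr uk vi".split()
)
NUM_SPEAKERS = 10  # speaker IDs 0 to 9, as listed above


def check_request(lang: str, sid: int) -> None:
    """Raise ValueError if lang or sid is outside what this page lists."""
    if lang not in SUPPORTED_LANGS:
        raise ValueError(f"unsupported language code: {lang!r}")
    if not 0 <= sid < NUM_SPEAKERS:
        raise ValueError(f"sid must be in [0, {NUM_SPEAKERS - 1}], got {sid}")


check_request("hr", 0)  # Croatian, first speaker: passes silently
```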

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-supertonic-3-tts-int8-2026-05-11.tar.bz2

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI | URL | China mirror
arm64-v8a | Download | Download
armeabi-v7a | Download | Download
x86_64 | Download | Download
x86 | Download | Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-supertonic-3-tts-int8-2026-05-11.tar.bz2

You can use the following code to play with supertonic-3:

import sherpa_onnx
import soundfile as sf

tts_config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        supertonic=sherpa_onnx.OfflineTtsSupertonicModelConfig(
            duration_predictor="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
            text_encoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
            vector_estimator="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
            vocoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
            tts_json="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
            unicode_indexer="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
            voice_style="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
        ),
        debug=False,
        num_threads=2,
        provider="cpu",
    ),
)

tts = sherpa_onnx.OfflineTts(tts_config)

gen_config = sherpa_onnx.GenerationConfig()
gen_config.sid = 0
gen_config.num_steps = 8
gen_config.speed = 1.0
gen_config.extra["lang"] = "hr"

audio = tts.generate("Ovo je mehanizam za pretvaranje teksta u govor koji koristi Kaldi sljedeće generacije", gen_config)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)

C API

You can use the following code to play with supertonic-3 with C API.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));
  config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
  config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
  config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
  config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
  config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
  config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
  config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";

  config.model.num_threads = 2;

  // If you don't want to see debug messages, please set it to 0
  config.model.debug = 1;

  const char *text = "Ovo je mehanizam za pretvaranje teksta u govor koji koristi Kaldi sljedeće generacije";

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.num_steps = 8;
  gen_cfg.speed = 1.0;
  gen_cfg.extra = "{\"lang\": \"hr\"}";

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg,
                                             ProgressCallback, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
 -DSHERPA_ONNX_ENABLE_C_API=ON \
 -DCMAKE_BUILD_TYPE=Release \
 -DBUILD_SHARED_LIBS=ON \
 -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
 ..

make
make install

You can find the required header and library files in /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-supertonic.c. Then you can compile it with the following command:

gcc \
  -I /tmp/sherpa-onnx/shared/include \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-c-api \
  -lonnxruntime \
  -o /tmp/test-supertonic \
  /tmp/test-supertonic.c

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-supertonic.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html
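
The ProgressCallback in the C example above returns 1 to continue generating and 0 to stop. The following is a minimal pure-Python simulation of that contract, not sherpa-onnx code: `fake_generate` and `stop_after_half` are hypothetical names, and whether the real library keeps the chunk on which the callback returns 0 is an assumption made only for this sketch.

```python
from typing import Callable, List


def fake_generate(chunks: List[List[float]],
                  callback: Callable[[List[float], float], int]) -> List[float]:
    """Hypothetical stand-in for a TTS engine that reports progress per chunk."""
    samples: List[float] = []
    for i, chunk in enumerate(chunks, start=1):
        samples.extend(chunk)
        progress = i / len(chunks)
        # Mirror the contract from the example: 1 continues, 0 stops early.
        if callback(chunk, progress) == 0:
            break
    return samples


def stop_after_half(chunk: List[float], progress: float) -> int:
    """Ask the generator to stop once at least half of the work is done."""
    return 0 if progress >= 0.5 else 1


if __name__ == "__main__":
    chunks = [[0.0] * 4 for _ in range(4)]
    partial = fake_generate(chunks, stop_after_half)
    print(len(partial))  # 8 samples: generation stopped after the second chunk
```

Returning 0 from a callback is a simple way to let a user cancel a long synthesis, for example from a "stop" button.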

C++ API

You can use the following code to play with supertonic-3 with the C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;

  config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
  config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
  config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
  config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
  config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
  config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
  config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";

  config.model.num_threads = 2;

  // If you don't want to see debug messages, please set it to 0
  config.model.debug = 1;

  std::string filename = "./test.wav";
  std::string text = "Ovo je mehanizam za pretvaranje teksta u govor koji koristi Kaldi sljedeće generacije";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.num_steps = 8;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed
  gen_cfg.extra = R"({"lang": "hr"})";

  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
 -DSHERPA_ONNX_ENABLE_C_API=ON \
 -DCMAKE_BUILD_TYPE=Release \
 -DBUILD_SHARED_LIBS=ON \
 -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
 ..

make
make install

You can find the required header and library files in /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-supertonic.cc. Then you can compile it with the following command:

g++ \
  -std=c++17 \
  -I /tmp/sherpa-onnx/shared/include \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-cxx-api \
  -lsherpa-onnx-c-api \
  -lonnxruntime \
  -o /tmp/test-supertonic \
  /tmp/test-supertonic.cc

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-supertonic.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with supertonic-3 with the Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsSupertonicModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            supertonic: OfflineTtsSupertonicModelConfig {
                duration_predictor: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx".into()),
                text_encoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx".into()),
                vector_estimator: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx".into()),
                vocoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx".into()),
                tts_json: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json".into()),
                unicode_indexer: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin".into()),
                voice_style: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Ovo je mehanizam za pretvaranje teksta u govor koji koristi Kaldi sljedeće generacije";

    let gen_config = GenerationConfig {
        sid: 0,
        num_steps: 8,
        speed: 1.0,
        extra: Some(r#"{"lang": "hr"}"#.into()),
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with supertonic-3 with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      supertonic: {
        durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
        textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
        vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
        vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
        ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
        unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
        voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
      },
      debug: true,
      numThreads: 2,
      provider: 'cpu',
    },
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Ovo je mehanizam za pretvaranje teksta u govor koji koristi Kaldi sljedeće generacije';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  numSteps: 8,
  speed: 1.0,
  extra: {lang: 'hr'},
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.
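
The real-time factor (RTF) printed by the example above is the elapsed synthesis time divided by the duration of the generated audio, so a value below 1 means synthesis runs faster than playback. The same arithmetic is sketched below in Python. The function name `real_time_factor` and the numbers used are illustrative assumptions, not part of sherpa-onnx.

```python
def real_time_factor(elapsed_seconds: float, num_samples: int,
                     sample_rate: int) -> float:
    """Return elapsed time divided by audio duration (lower is faster)."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # seconds of generated audio
    return elapsed_seconds / duration


if __name__ == "__main__":
    # Hypothetical numbers: 64000 samples at 16 kHz (4 s) generated in 0.5 s.
    rtf = real_time_factor(0.5, 64000, 16000)
    print(f"RTF = {rtf:.3f}")  # RTF = 0.125
```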

Dart API

You can use the following code to play with supertonic-3 with the Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final supertonic = sherpa_onnx.OfflineTtsSupertonicModelConfig(
    durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
    textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
    vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
    vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
    ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
    unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
    voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    supertonic: supertonic,
    numThreads: 2,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    numSteps: 8,
    speed: 1.0,
    extra: {'lang': 'hr'},
  );
  final audio = tts.generateWithConfig(text: 'Ovo je mehanizam za pretvaranje teksta u govor koji koristi Kaldi sljedeće generacije', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with supertonic-3 with the Swift API.

func run() {
  let supertonic = sherpaOnnxOfflineTtsSupertonicModelConfig(
    durationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
    textEncoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
    vectorEstimator: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
    vocoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
    ttsJson: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
    unicodeIndexer: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
    voiceStyle: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(supertonic: supertonic)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Ovo je mehanizam za pretvaranje teksta u govor koji koristi Kaldi sljedeće generacije"
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.numSteps = 8
  genConfig.speed = 1.0
  genConfig.extra = ["lang": "hr"]

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with supertonic-3 with the C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Supertonic.DurationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.Model.Supertonic.TextEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.Model.Supertonic.VectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.Model.Supertonic.Vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.Model.Supertonic.TtsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.Model.Supertonic.UnicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.Model.Supertonic.VoiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";

config.Model.NumThreads = 2;
config.Model.Debug = 1;
config.Model.Provider = "cpu";

var tts = new OfflineTts(config);
var text = "Ovo je mehanizam za pretvaranje teksta u govor koji koristi Kaldi sljedeće generacije";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.NumSteps = 8;
genConfig.Speed = 1.0f;
genConfig.Extra = "{\"lang\": \"hr\"}";

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with supertonic-3 with the Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      supertonic = OfflineTtsSupertonicModelConfig(
        durationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
        textEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
        vectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
        vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
        ttsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
        unicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
        voiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
      ),
      numThreads = 2,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    numSteps = 8,
    speed = 1.0f,
    extra = mapOf("lang" to "hr"),
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Ovo je mehanizam za pretvaranje teksta u govor koji koristi Kaldi sljedeće generacije",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with supertonic-3 with the Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var supertonic = new OfflineTtsSupertonicModelConfig();
    supertonic.setDurationPredictor("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx");
    supertonic.setTextEncoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx");
    supertonic.setVectorEstimator("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx");
    supertonic.setVocoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx");
    supertonic.setTtsJson("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json");
    supertonic.setUnicodeIndexer("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin");
    supertonic.setVoiceStyle("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setSupertonic(supertonic);
    modelConfig.setNumThreads(2);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);

    var tts = new OfflineTts(config);
    var text = "Ovo je mehanizam za pretvaranje teksta u govor koji koristi Kaldi sljedeće generacije";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setNumSteps(8);
    genConfig.setSpeed(1.0f);
    genConfig.setExtra("{\"lang\": \"hr\"}");

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.
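
Several of the examples above pass the language to the model as a hand-escaped JSON string in the `extra` field. A small sketch of producing that string with a JSON serializer instead is shown below in Python. `build_extra` is a hypothetical helper, and only the `lang` key is taken from the examples; any other keys would be assumptions.

```python
import json


def build_extra(lang: str) -> str:
    """Serialize the per-request options to the JSON text used for `extra`."""
    if not lang:
        raise ValueError("lang must be a non-empty language code")
    return json.dumps({"lang": lang})


if __name__ == "__main__":
    print(build_extra("hr"))  # {"lang": "hr"}
```

Using a serializer avoids quoting mistakes when the value comes from user input rather than a literal.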

Pascal API

You can use the following code to play with supertonic-3 with the Pascal API.

program test_supertonic;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Supertonic.DurationPredictor := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx';
  Config.Model.Supertonic.TextEncoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx';
  Config.Model.Supertonic.VectorEstimator := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx';
  Config.Model.Supertonic.Vocoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx';
  Config.Model.Supertonic.TtsJson := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json';
  Config.Model.Supertonic.UnicodeIndexer := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin';
  Config.Model.Supertonic.VoiceStyle := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin';

  Config.Model.NumThreads := 2;
  Config.Model.Debug := True;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.NumSteps := 8;
  GenConfig.Speed := 1.0;
  GenConfig.Extra := '{"lang": "hr"}';

  Audio := Tts.GenerateWithConfig('Ovo je mehanizam za pretvaranje teksta u govor koji koristi Kaldi sljedeće generacije', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with supertonic-3 with the Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Supertonic: sherpa.OfflineTtsSupertonicModelConfig{
				DurationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
				TextEncoder:       "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
				VectorEstimator:   "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
				Vocoder:           "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
				TtsJson:           "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
				UnicodeIndexer:    "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
				VoiceStyle:        "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
			},
			NumThreads: 2,
			Debug:      true,
		},
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Ovo je mehanizam za pretvaranje teksta u govor koji koristi Kaldi sljedeće generacije"

	genConfig := sherpa.GenerationConfig{
		Sid:       0,
		NumSteps:  8,
		Speed:     1.0,
		Extra:     `{"lang": "hr"}`,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

Sample audio clips for the different speakers are listed below:

Speaker 0

0 | Pozdrav svijete.
1 | Kako si danas?
2 | Nebo je plavo, a vjetar je blag.
3 | Strojno učenje pomaže računalima učiti iz podataka.
4 | Sinteza govora pretvara tekst u jasan zvuk.
5 | Učenici su u knjižnici pročitali kratku priču.
6 | Vlak je kasnio zbog održavanja pruge.
7 | Mali modeli brzo rade na lokalnim uređajima.
8 | Glasovni asistent pomaže u svakodnevnim zadacima.
9 | Stabilno čitanje važno je za kratke i duge rečenice.

Speaker 1

0 | Pozdrav svijete.
1 | Kako si danas?
2 | Nebo je plavo, a vjetar je blag.
3 | Strojno učenje pomaže računalima učiti iz podataka.
4 | Sinteza govora pretvara tekst u jasan zvuk.
5 | Učenici su u knjižnici pročitali kratku priču.
6 | Vlak je kasnio zbog održavanja pruge.
7 | Mali modeli brzo rade na lokalnim uređajima.
8 | Glasovni asistent pomaže u svakodnevnim zadacima.
9 | Stabilno čitanje važno je za kratke i duge rečenice.

Speaker 2

0 | Pozdrav svijete.
1 | Kako si danas?
2 | Nebo je plavo, a vjetar je blag.
3 | Strojno učenje pomaže računalima učiti iz podataka.
4 | Sinteza govora pretvara tekst u jasan zvuk.
5 | Učenici su u knjižnici pročitali kratku priču.
6 | Vlak je kasnio zbog održavanja pruge.
7 | Mali modeli brzo rade na lokalnim uređajima.
8 | Glasovni asistent pomaže u svakodnevnim zadacima.
9 | Stabilno čitanje važno je za kratke i duge rečenice.

Speaker 3

0 | Pozdrav svijete.
1 | Kako si danas?
2 | Nebo je plavo, a vjetar je blag.
3 | Strojno učenje pomaže računalima učiti iz podataka.
4 | Sinteza govora pretvara tekst u jasan zvuk.
5 | Učenici su u knjižnici pročitali kratku priču.
6 | Vlak je kasnio zbog održavanja pruge.
7 | Mali modeli brzo rade na lokalnim uređajima.
8 | Glasovni asistent pomaže u svakodnevnim zadacima.
9 | Stabilno čitanje važno je za kratke i duge rečenice.

Speaker 4

0 | Pozdrav svijete.
1 | Kako si danas?
2 | Nebo je plavo, a vjetar je blag.
3 | Strojno učenje pomaže računalima učiti iz podataka.
4 | Sinteza govora pretvara tekst u jasan zvuk.
5 | Učenici su u knjižnici pročitali kratku priču.
6 | Vlak je kasnio zbog održavanja pruge.
7 | Mali modeli brzo rade na lokalnim uređajima.
8 | Glasovni asistent pomaže u svakodnevnim zadacima.
9 | Stabilno čitanje važno je za kratke i duge rečenice.

Speaker 5

0 | Pozdrav svijete.
1 | Kako si danas?
2 | Nebo je plavo, a vjetar je blag.
3 | Strojno učenje pomaže računalima učiti iz podataka.
4 | Sinteza govora pretvara tekst u jasan zvuk.
5 | Učenici su u knjižnici pročitali kratku priču.
6 | Vlak je kasnio zbog održavanja pruge.
7 | Mali modeli brzo rade na lokalnim uređajima.
8 | Glasovni asistent pomaže u svakodnevnim zadacima.
9 | Stabilno čitanje važno je za kratke i duge rečenice.

Speaker 6

0 | Pozdrav svijete.
1 | Kako si danas?
2 | Nebo je plavo, a vjetar je blag.
3 | Strojno učenje pomaže računalima učiti iz podataka.
4 | Sinteza govora pretvara tekst u jasan zvuk.
5 | Učenici su u knjižnici pročitali kratku priču.
6 | Vlak je kasnio zbog održavanja pruge.
7 | Mali modeli brzo rade na lokalnim uređajima.
8 | Glasovni asistent pomaže u svakodnevnim zadacima.
9 | Stabilno čitanje važno je za kratke i duge rečenice.

Speaker 7

0 | Pozdrav svijete.
1 | Kako si danas?
2 | Nebo je plavo, a vjetar je blag.
3 | Strojno učenje pomaže računalima učiti iz podataka.
4 | Sinteza govora pretvara tekst u jasan zvuk.
5 | Učenici su u knjižnici pročitali kratku priču.
6 | Vlak je kasnio zbog održavanja pruge.
7 | Mali modeli brzo rade na lokalnim uređajima.
8 | Glasovni asistent pomaže u svakodnevnim zadacima.
9 | Stabilno čitanje važno je za kratke i duge rečenice.

Speaker 8

0 | Pozdrav svijete.
1 | Kako si danas?
2 | Nebo je plavo, a vjetar je blag.
3 | Strojno učenje pomaže računalima učiti iz podataka.
4 | Sinteza govora pretvara tekst u jasan zvuk.
5 | Učenici su u knjižnici pročitali kratku priču.
6 | Vlak je kasnio zbog održavanja pruge.
7 | Mali modeli brzo rade na lokalnim uređajima.
8 | Glasovni asistent pomaže u svakodnevnim zadacima.
9 | Stabilno čitanje važno je za kratke i duge rečenice.

Speaker 9

0 | Pozdrav svijete.
1 | Kako si danas?
2 | Nebo je plavo, a vjetar je blag.
3 | Strojno učenje pomaže računalima učiti iz podataka.
4 | Sinteza govora pretvara tekst u jasan zvuk.
5 | Učenici su u knjižnici pročitali kratku priču.
6 | Vlak je kasnio zbog održavanja pruge.
7 | Mali modeli brzo rade na lokalnim uređajima.
8 | Glasovni asistent pomaže u svakodnevnim zadacima.
9 | Stabilno čitanje važno je za kratke i duge rečenice.

Czech

This section lists text-to-speech models for Czech.

vits-piper-cs_CZ-jirka-low

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/cs/cs_CZ/jirka/low

Number of speakers | Sample rate
1 | 16000

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-cs_CZ-jirka-low.tar.bz2

Android APK

The following table lists the Android TTS Engine APKs that include this model for sherpa-onnx v1.13.2.

ABI | URL | China mirror
arm64-v8a | Download | Download
armeabi-v7a | Download | Download
x86_64 | Download | Download
x86 | Download | Download

If you don't know which ABI your device uses, arm64-v8a is most likely the right choice.
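
One rough way to pick an APK is to map the CPU architecture string reported by the device to an ABI name. The sketch below is illustrative only: `abi_for_machine` is a hypothetical helper, the mapping covers just a few common `uname -m` style values, and devices that run a 32-bit userland on a 64-bit CPU are not handled.

```python
from typing import Optional

# Common machine strings and the Android ABI they usually correspond to.
_MACHINE_TO_ABI = {
    "aarch64": "arm64-v8a",
    "arm64": "arm64-v8a",
    "armv7l": "armeabi-v7a",
    "x86_64": "x86_64",
    "i686": "x86",
    "i386": "x86",
}


def abi_for_machine(machine: str) -> Optional[str]:
    """Return the ABI name for a machine string, or None if it is unknown."""
    return _MACHINE_TO_ABI.get(machine.strip().lower())


if __name__ == "__main__":
    print(abi_for_machine("aarch64"))  # arm64-v8a
```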

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-cs_CZ-jirka-low with the C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-cs_CZ-jirka-low/cs_CZ-jirka-low.onnx";
  config.model.vits.tokens = "vits-piper-cs_CZ-jirka-low/tokens.txt";
  config.model.vits.data_dir = "vits-piper-cs_CZ-jirka-low/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Co můžeš udělat dnes, neodkládej na zítřek. ";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header and library files in /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc   -I /tmp/sherpa-onnx/shared/include   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-c-api   -lonnxruntime   -o /tmp/test-piper   /tmp/test-piper.c

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-cs_CZ-jirka-low.tar.bz2

You can use the following code to play with vits-piper-cs_CZ-jirka-low

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-cs_CZ-jirka-low/cs_CZ-jirka-low.onnx",
            data_dir="vits-piper-cs_CZ-jirka-low/espeak-ng-data",
            tokens="vits-piper-cs_CZ-jirka-low/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Co můžeš udělat dnes, neodkládej na zítřek. ",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
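
If soundfile is not installed, the generated samples can also be written with Python's standard wave module. This is a sketch under the assumption that audio.samples is a sequence of floats in the range [-1, 1] and that mono 16-bit PCM output is acceptable. `write_wav_16bit` is a hypothetical helper, not part of sherpa-onnx.

```python
import struct
import wave
from typing import Sequence


def write_wav_16bit(filename: str, samples: Sequence[float],
                    sample_rate: int) -> None:
    """Write mono float samples in [-1, 1] as 16-bit PCM."""
    # Clip first so out-of-range values do not overflow the int16 range.
    pcm = [int(max(-1.0, min(1.0, s)) * 32767) for s in samples]
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)  # 2 bytes per sample = 16 bit
        f.setframerate(sample_rate)
        f.writeframes(struct.pack(f"<{len(pcm)}h", *pcm))
```

With the objects from the example above, the call would be `write_wav_16bit("test.wav", audio.samples, audio.sample_rate)`.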

C++ API

You can use the following code to play with vits-piper-cs_CZ-jirka-low with the C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-cs_CZ-jirka-low/cs_CZ-jirka-low.onnx";
  config.model.vits.tokens = "vits-piper-cs_CZ-jirka-low/tokens.txt";
  config.model.vits.data_dir = "vits-piper-cs_CZ-jirka-low/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Co můžeš udělat dnes, neodkládej na zítřek. ";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header and library files in /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++   -std=c++17   -I /tmp/sherpa-onnx/shared/include   -L /tmp/sherpa-onnx/shared/lib   -o /tmp/test-piper   /tmp/test-piper.cc   -lsherpa-onnx-cxx-api   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to try vits-piper-cs_CZ-jirka-low with the Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-cs_CZ-jirka-low/cs_CZ-jirka-low.onnx".into()),
                tokens: Some("vits-piper-cs_CZ-jirka-low/tokens.txt".into()),
                data_dir: Some("vits-piper-cs_CZ-jirka-low/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Co můžeš udělat dnes, neodkládej na zítřek. ";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-cs_CZ-jirka-low with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-cs_CZ-jirka-low/cs_CZ-jirka-low.onnx',
        tokens: 'vits-piper-cs_CZ-jirka-low/tokens.txt',
        dataDir: 'vits-piper-cs_CZ-jirka-low/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Co můžeš udělat dnes, neodkládej na zítřek. ';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to try vits-piper-cs_CZ-jirka-low with the Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-cs_CZ-jirka-low/cs_CZ-jirka-low.onnx',
    tokens: 'vits-piper-cs_CZ-jirka-low/tokens.txt',
    dataDir: 'vits-piper-cs_CZ-jirka-low/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Co můžeš udělat dnes, neodkládej na zítřek. ', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to try vits-piper-cs_CZ-jirka-low with the Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-cs_CZ-jirka-low/cs_CZ-jirka-low.onnx",
    lexicon: "",
    tokens: "vits-piper-cs_CZ-jirka-low/tokens.txt",
    dataDir: "vits-piper-cs_CZ-jirka-low/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Co můžeš udělat dnes, neodkládej na zítřek. "
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to try vits-piper-cs_CZ-jirka-low with the C# API.

using System;
using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-cs_CZ-jirka-low/cs_CZ-jirka-low.onnx";
config.Model.Vits.Tokens = "vits-piper-cs_CZ-jirka-low/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-cs_CZ-jirka-low/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Co můžeš udělat dnes, neodkládej na zítřek. ";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to try vits-piper-cs_CZ-jirka-low with the Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  val config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-cs_CZ-jirka-low/cs_CZ-jirka-low.onnx",
        tokens = "vits-piper-cs_CZ-jirka-low/tokens.txt",
        dataDir = "vits-piper-cs_CZ-jirka-low/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Co můžeš udělat dnes, neodkládej na zítřek. ",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to try vits-piper-cs_CZ-jirka-low with the Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-cs_CZ-jirka-low/cs_CZ-jirka-low.onnx");
    vits.setTokens("vits-piper-cs_CZ-jirka-low/tokens.txt");
    vits.setDataDir("vits-piper-cs_CZ-jirka-low/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Co můžeš udělat dnes, neodkládej na zítřek. ";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to try vits-piper-cs_CZ-jirka-low with the Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-cs_CZ-jirka-low/cs_CZ-jirka-low.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-cs_CZ-jirka-low/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-cs_CZ-jirka-low/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Co můžeš udělat dnes, neodkládej na zítřek. ', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to try vits-piper-cs_CZ-jirka-low with the Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-cs_CZ-jirka-low/cs_CZ-jirka-low.onnx",
				Tokens:  "vits-piper-cs_CZ-jirka-low/tokens.txt",
				DataDir: "vits-piper-cs_CZ-jirka-low/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Co můžeš udělat dnes, neodkládej na zítřek. "

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Co můžeš udělat dnes, neodkládej na zítřek.

audio samples for each speaker are listed below:

Speaker 0

vits-piper-cs_CZ-jirka-medium

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/cs/cs_CZ/jirka/medium

Number of speakers | Sample rate (Hz)
1 | 22050

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-cs_CZ-jirka-medium.tar.bz2
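As a rough sketch of the download step in Python rather than the shell, the snippet below fetches the archive above and unpacks it using only the standard library. The helper names archive_name, extract_model, and download_model are made up for this illustration and are not part of sherpa-onnx.

```python
import tarfile
import urllib.request
from pathlib import Path

MODEL_URL = (
    "https://github.com/k2-fsa/sherpa-onnx/releases/download/"
    "tts-models/vits-piper-cs_CZ-jirka-medium.tar.bz2"
)


def archive_name(url: str) -> str:
    """Return the file name at the end of a download URL."""
    return url.rsplit("/", 1)[-1]


def extract_model(archive: Path, dest: Path) -> None:
    """Unpack a .tar.bz2 archive into dest (hypothetical helper)."""
    with tarfile.open(archive, "r:bz2") as tar:
        tar.extractall(path=dest)


def download_model(url: str = MODEL_URL, dest: Path = Path(".")) -> Path:
    """Download the archive, unpack it, delete the archive, return the model directory."""
    archive = dest / archive_name(url)
    urllib.request.urlretrieve(url, archive)
    extract_model(archive, dest)
    archive.unlink()
    # Assumption: the archive unpacks into a directory named after the archive,
    # as the file paths used in the examples for this model suggest.
    return dest / archive.name[: -len(".tar.bz2")]
```

Calling download_model() from the directory where you run the examples should leave a vits-piper-cs_CZ-jirka-medium/ directory there, provided the archive layout matches the paths used in the examples.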

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI | URL | China mirror
arm64-v8a | Download | Download
armeabi-v7a | Download | Download
x86_64 | Download | Download
x86 | Download | Download

If you don't know which ABI your device uses, arm64-v8a is the most likely choice.
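If you would rather check than guess, a connected device can report its ABIs through adb. The following is a minimal sketch, assuming adb is installed and a device is attached; pick_apk_abi and query_device_abilist are hypothetical helper names, and the fallback simply repeats the recommendation above.

```python
import subprocess

# ABIs for which this page lists an APK.
APK_ABIS = ["arm64-v8a", "armeabi-v7a", "x86_64", "x86"]


def pick_apk_abi(device_abilist: str) -> str:
    """Pick the first ABI reported by the device that has an APK in the table."""
    # The device lists its ABIs in order of preference, separated by commas.
    for abi in device_abilist.strip().split(","):
        if abi in APK_ABIS:
            return abi
    # Nothing matched: fall back to the page's own recommendation.
    return "arm64-v8a"


def query_device_abilist() -> str:
    """Ask a connected device for its ABI list (assumes adb is on PATH)."""
    result = subprocess.run(
        ["adb", "shell", "getprop", "ro.product.cpu.abilist"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

With a device attached, pick_apk_abi(query_device_abilist()) returns the row of the table to download.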

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to try vits-piper-cs_CZ-jirka-medium with the C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-cs_CZ-jirka-medium/cs_CZ-jirka-medium.onnx";
  config.model.vits.tokens = "vits-piper-cs_CZ-jirka-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-cs_CZ-jirka-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

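  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
  if (!tts) {
    // Creation fails if, for example, a model path is wrong
    fprintf(stderr, "Failed to create the TTS engine. Please check your config\n");
    return 1;
  }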

  const char *text = "Co můžeš udělat dnes, neodkládej na zítřek. ";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header and library files in /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc   -I /tmp/sherpa-onnx/shared/include   -L /tmp/sherpa-onnx/shared/lib   -o /tmp/test-piper   /tmp/test-piper.c   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-cs_CZ-jirka-medium.tar.bz2

You can use the following code to play with vits-piper-cs_CZ-jirka-medium

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-cs_CZ-jirka-medium/cs_CZ-jirka-medium.onnx",
            data_dir="vits-piper-cs_CZ-jirka-medium/espeak-ng-data",
            tokens="vits-piper-cs_CZ-jirka-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Co můžeš udělat dnes, neodkládej na zítřek. ",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
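The Node.js example for this model reports a real-time factor (RTF), i.e. processing time divided by the duration of the generated audio. The same measurement can be sketched in Python as follows; real_time_factor and timed are hypothetical helpers written for this illustration, not sherpa-onnx functions.

```python
import time


def real_time_factor(elapsed_seconds: float, num_samples: int, sample_rate: int) -> float:
    """RTF = processing time / audio duration; below 1.0 is faster than real time."""
    duration = num_samples / sample_rate
    return elapsed_seconds / duration


def timed(fn, *args, **kwargs):
    """Run fn and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


# Usage with the objects from the example above (not executed here):
#   audio, elapsed = timed(tts.generate, text="...", sid=0, speed=1.0)
#   print(real_time_factor(elapsed, len(audio.samples), audio.sample_rate))
```

For instance, generating one second of 22050 Hz audio in half a second gives an RTF of 0.5.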

C++ API

You can use the following code to try vits-piper-cs_CZ-jirka-medium with the C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-cs_CZ-jirka-medium/cs_CZ-jirka-medium.onnx";
  config.model.vits.tokens = "vits-piper-cs_CZ-jirka-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-cs_CZ-jirka-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Co můžeš udělat dnes, neodkládej na zítřek. ";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header and library files in /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++   -std=c++17   -I /tmp/sherpa-onnx/shared/include   -L /tmp/sherpa-onnx/shared/lib   -o /tmp/test-piper   /tmp/test-piper.cc   -lsherpa-onnx-cxx-api   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html
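The return-value convention of ProgressCallback in the C++ example (1 to continue, 0 to stop) can be used to cut generation short. The sketch below shows that logic in plain Python: make_progress_callback is a hypothetical helper, and whether a given language binding accepts a callback of exactly this shape is an assumption you should check against its documentation.

```python
def make_progress_callback(max_seconds: float, sample_rate: int):
    """Build a callback that asks generation to stop after max_seconds of audio."""
    state = {"num_samples": 0, "last_progress": 0.0}

    def callback(samples, progress: float) -> int:
        # Each call delivers the next chunk of generated samples.
        state["num_samples"] += len(samples)
        state["last_progress"] = progress
        generated_seconds = state["num_samples"] / sample_rate
        # Same convention as ProgressCallback above: 1 continues, 0 stops.
        return 1 if generated_seconds < max_seconds else 0

    return callback, state
```

The returned state dictionary lets the caller inspect how many samples had been produced when generation stopped.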

Rust API

You can use the following code to try vits-piper-cs_CZ-jirka-medium with the Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-cs_CZ-jirka-medium/cs_CZ-jirka-medium.onnx".into()),
                tokens: Some("vits-piper-cs_CZ-jirka-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-cs_CZ-jirka-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Co můžeš udělat dnes, neodkládej na zítřek. ";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-cs_CZ-jirka-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-cs_CZ-jirka-medium/cs_CZ-jirka-medium.onnx',
        tokens: 'vits-piper-cs_CZ-jirka-medium/tokens.txt',
        dataDir: 'vits-piper-cs_CZ-jirka-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Co můžeš udělat dnes, neodkládej na zítřek. ';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to try vits-piper-cs_CZ-jirka-medium with the Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-cs_CZ-jirka-medium/cs_CZ-jirka-medium.onnx',
    tokens: 'vits-piper-cs_CZ-jirka-medium/tokens.txt',
    dataDir: 'vits-piper-cs_CZ-jirka-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Co můžeš udělat dnes, neodkládej na zítřek. ', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to try vits-piper-cs_CZ-jirka-medium with the Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-cs_CZ-jirka-medium/cs_CZ-jirka-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-cs_CZ-jirka-medium/tokens.txt",
    dataDir: "vits-piper-cs_CZ-jirka-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Co můžeš udělat dnes, neodkládej na zítřek. "
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to try vits-piper-cs_CZ-jirka-medium with the C# API.

using System;
using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-cs_CZ-jirka-medium/cs_CZ-jirka-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-cs_CZ-jirka-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-cs_CZ-jirka-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Co můžeš udělat dnes, neodkládej na zítřek. ";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to try vits-piper-cs_CZ-jirka-medium with the Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  val config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-cs_CZ-jirka-medium/cs_CZ-jirka-medium.onnx",
        tokens = "vits-piper-cs_CZ-jirka-medium/tokens.txt",
        dataDir = "vits-piper-cs_CZ-jirka-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Co můžeš udělat dnes, neodkládej na zítřek. ",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to try vits-piper-cs_CZ-jirka-medium with the Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-cs_CZ-jirka-medium/cs_CZ-jirka-medium.onnx");
    vits.setTokens("vits-piper-cs_CZ-jirka-medium/tokens.txt");
    vits.setDataDir("vits-piper-cs_CZ-jirka-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Co můžeš udělat dnes, neodkládej na zítřek. ";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to try vits-piper-cs_CZ-jirka-medium with the Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-cs_CZ-jirka-medium/cs_CZ-jirka-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-cs_CZ-jirka-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-cs_CZ-jirka-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Co můžeš udělat dnes, neodkládej na zítřek. ', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to try vits-piper-cs_CZ-jirka-medium with the Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-cs_CZ-jirka-medium/cs_CZ-jirka-medium.onnx",
				Tokens:  "vits-piper-cs_CZ-jirka-medium/tokens.txt",
				DataDir: "vits-piper-cs_CZ-jirka-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Co můžeš udělat dnes, neodkládej na zítřek. "

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Co můžeš udělat dnes, neodkládej na zítřek.

audio samples for each speaker are listed below:

Speaker 0

supertonic-3-cs

Info about this model

This model is supertonic 3 from https://huggingface.co/Supertone/supertonic-3

It supports 31 languages: en, ko, ja, ar, bg, cs, da, de, el, es, et, fi, fr, hi, hr, hu, id, it, lt, lv, nl, pl, pt, ro, ru, sk, sl, sv, tr, uk, vi.

This page shows samples for Czech (cs).

Number of speakers | Sample rate (Hz)
10 | 24000

Speaker IDs

sid: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-supertonic-3-tts-int8-2026-05-11.tar.bz2

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI | URL | China mirror
arm64-v8a | Download | Download
armeabi-v7a | Download | Download
x86_64 | Download | Download
x86 | Download | Download

If you don't know which ABI your device uses, arm64-v8a is the most likely choice.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-supertonic-3-tts-int8-2026-05-11.tar.bz2

You can use the following code to play with supertonic-3

import sherpa_onnx
import soundfile as sf

tts_config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        supertonic=sherpa_onnx.OfflineTtsSupertonicModelConfig(
            duration_predictor="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
            text_encoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
            vector_estimator="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
            vocoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
            tts_json="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
            unicode_indexer="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
            voice_style="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
        ),
        debug=False,
        num_threads=2,
        provider="cpu",
    ),
)

tts = sherpa_onnx.OfflineTts(tts_config)

gen_config = sherpa_onnx.GenerationConfig()
gen_config.sid = 0
gen_config.num_steps = 8
gen_config.speed = 1.0
gen_config.extra["lang"] = "cs"

audio = tts.generate("Toto je převodník textu na řeč využívající novou generaci kaldi", gen_config)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
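
A wrong path in the config is an easy mistake to make, so it can help to check that all model files exist before creating the TTS object. The sketch below does that with the standard library. The file names are taken from the example above; missing_model_files is a hypothetical helper, not part of sherpa-onnx.

```python
from pathlib import Path

MODEL_DIR = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11"

# File names as used in the config of the example above.
REQUIRED_FILES = [
    "duration_predictor.int8.onnx",
    "text_encoder.int8.onnx",
    "vector_estimator.int8.onnx",
    "vocoder.int8.onnx",
    "tts.json",
    "unicode_indexer.bin",
    "voice.bin",
]


def missing_model_files(model_dir):
    """Return the required file names that are not present in model_dir."""
    root = Path(model_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]


missing = missing_model_files(MODEL_DIR)
if missing:
    print("Missing model files:", ", ".join(missing))
```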

C API

You can use the following code to play with supertonic-3 with C API.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));
  config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
  config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
  config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
  config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
  config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
  config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
  config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";

  config.model.num_threads = 2;

  // If you don't want to see debug messages, please set it to 0
  config.model.debug = 1;

  const char *text = "Toto je převodník textu na řeč využívající novou generaci kaldi";

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.num_steps = 8;
  gen_cfg.speed = 1.0;
  gen_cfg.extra = "{\"lang\": \"cs\"}";

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg,
                                             ProgressCallback, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
 -DSHERPA_ONNX_ENABLE_C_API=ON \
 -DCMAKE_BUILD_TYPE=Release \
 -DBUILD_SHARED_LIBS=ON \
 -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
 ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-supertonic.c. Then you can compile it with the following command:

gcc \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-supertonic \
  /tmp/test-supertonic.c \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-supertonic.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

C++ API

You can use the following code to play with supertonic-3 with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;

  config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
  config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
  config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
  config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
  config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
  config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
  config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";

  config.model.num_threads = 2;

  // If you don't want to see debug messages, please set it to 0
  config.model.debug = 1;

  std::string filename = "./test.wav";
  std::string text = "Toto je převodník textu na řeč využívající novou generaci kaldi";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.num_steps = 8;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed
  gen_cfg.extra = R"({"lang": "cs"})";

  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
 -DSHERPA_ONNX_ENABLE_C_API=ON \
 -DCMAKE_BUILD_TYPE=Release \
 -DBUILD_SHARED_LIBS=ON \
 -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
 ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-supertonic.cc. Then you can compile it with the following command:

g++ \
  -std=c++17 \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-supertonic \
  /tmp/test-supertonic.cc \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-cxx-api \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-supertonic.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with supertonic-3 with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsSupertonicModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            supertonic: OfflineTtsSupertonicModelConfig {
                duration_predictor: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx".into()),
                text_encoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx".into()),
                vector_estimator: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx".into()),
                vocoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx".into()),
                tts_json: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json".into()),
                unicode_indexer: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin".into()),
                voice_style: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Toto je převodník textu na řeč využívající novou generaci kaldi";

    let gen_config = GenerationConfig {
        sid: 0,
        num_steps: 8,
        speed: 1.0,
        extra: Some(r#"{"lang": "cs"}"#.into()),
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with supertonic-3 with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      supertonic: {
        durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
        textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
        vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
        vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
        ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
        unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
        voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
      },
      debug: true,
      numThreads: 2,
      provider: 'cpu',
    },
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Toto je převodník textu na řeč využívající novou generaci kaldi';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  numSteps: 8,
  speed: 1.0,
  extra: {lang: 'cs'},
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.
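
The Node.js example above reports a real-time factor (RTF), which is the elapsed synthesis time divided by the duration of the generated audio. A value below 1 means that synthesis ran faster than playback. The same arithmetic can be sketched in Python; real_time_factor is a hypothetical helper name.

```python
def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """RTF = elapsed time / audio duration, as in the Node.js example above."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # audio length in seconds
    return elapsed_seconds / duration


# 48000 samples at 24000 Hz are 2 seconds of audio.
# Generating them in 0.5 seconds gives an RTF of 0.25.
print(real_time_factor(0.5, 48000, 24000))  # 0.25
```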

Dart API

You can use the following code to play with supertonic-3 with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final supertonic = sherpa_onnx.OfflineTtsSupertonicModelConfig(
    durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
    textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
    vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
    vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
    ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
    unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
    voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    supertonic: supertonic,
    numThreads: 2,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    numSteps: 8,
    speed: 1.0,
    extra: {'lang': 'cs'},
  );
  final audio = tts.generateWithConfig(text: 'Toto je převodník textu na řeč využívající novou generaci kaldi', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with supertonic-3 with Swift API.

func run() {
  let supertonic = sherpaOnnxOfflineTtsSupertonicModelConfig(
    durationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
    textEncoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
    vectorEstimator: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
    vocoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
    ttsJson: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
    unicodeIndexer: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
    voiceStyle: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(supertonic: supertonic)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Toto je převodník textu na řeč využívající novou generaci kaldi"
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.numSteps = 8
  genConfig.speed = 1.0
  genConfig.extra = ["lang": "cs"]

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with supertonic-3 with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Supertonic.DurationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.Model.Supertonic.TextEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.Model.Supertonic.VectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.Model.Supertonic.Vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.Model.Supertonic.TtsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.Model.Supertonic.UnicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.Model.Supertonic.VoiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";

config.Model.NumThreads = 2;
config.Model.Debug = 1;
config.Model.Provider = "cpu";

var tts = new OfflineTts(config);
var text = "Toto je převodník textu na řeč využívající novou generaci kaldi";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.NumSteps = 8;
genConfig.Speed = 1.0f;
genConfig.Extra = "{\"lang\": \"cs\"}";

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with supertonic-3 with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      supertonic = OfflineTtsSupertonicModelConfig(
        durationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
        textEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
        vectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
        vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
        ttsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
        unicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
        voiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
      ),
      numThreads = 2,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    numSteps = 8,
    speed = 1.0f,
    extra = mapOf("lang" to "cs"),
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Toto je převodník textu na řeč využívající novou generaci kaldi",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with supertonic-3 with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var supertonic = new OfflineTtsSupertonicModelConfig();
    supertonic.setDurationPredictor("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx");
    supertonic.setTextEncoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx");
    supertonic.setVectorEstimator("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx");
    supertonic.setVocoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx");
    supertonic.setTtsJson("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json");
    supertonic.setUnicodeIndexer("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin");
    supertonic.setVoiceStyle("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setSupertonic(supertonic);
    modelConfig.setNumThreads(2);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);

    var tts = new OfflineTts(config);
    var text = "Toto je převodník textu na řeč využívající novou generaci kaldi";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setNumSteps(8);
    genConfig.setSpeed(1.0f);
    genConfig.setExtra("{\"lang\": \"cs\"}");

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with supertonic-3 with Pascal API.

program test_supertonic;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Supertonic.DurationPredictor := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx';
  Config.Model.Supertonic.TextEncoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx';
  Config.Model.Supertonic.VectorEstimator := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx';
  Config.Model.Supertonic.Vocoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx';
  Config.Model.Supertonic.TtsJson := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json';
  Config.Model.Supertonic.UnicodeIndexer := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin';
  Config.Model.Supertonic.VoiceStyle := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin';

  Config.Model.NumThreads := 2;
  Config.Model.Debug := True;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.NumSteps := 8;
  GenConfig.Speed := 1.0;
  GenConfig.Extra := '{"lang": "cs"}';

  Audio := Tts.GenerateWithConfig('Toto je převodník textu na řeč využívající novou generaci kaldi', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with supertonic-3 with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Supertonic: sherpa.OfflineTtsSupertonicModelConfig{
				DurationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
				TextEncoder:       "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
				VectorEstimator:   "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
				Vocoder:           "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
				TtsJson:           "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
				UnicodeIndexer:    "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
				VoiceStyle:        "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
			},
			NumThreads: 2,
			Debug:      true,
		},
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Toto je převodník textu na řeč využívající novou generaci kaldi"

	genConfig := sherpa.GenerationConfig{
		Sid:       0,
		NumSteps:  8,
		Speed:     1.0,
		Extra:     `{"lang": "cs"}`,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

Audio samples for each speaker are listed below, together with the text of each sample:

Speaker 0

0 | Ahoj světe.
1 | Jak se dnes máš?
2 | Obloha je modrá a vítr je mírný.
3 | Strojové učení pomáhá počítačům učit se z dat.
4 | Syntéza řeči převádí text na srozumitelný zvuk.
5 | Studenti četli krátký příběh v knihovně.
6 | Vlak měl zpoždění kvůli údržbě trati.
7 | Malé modely běží rychle na místních zařízeních.
8 | Hlasový asistent pomáhá s každodenními úkoly.
9 | Stabilní čtení je důležité pro dlouhé i krátké věty.

Speaker 1

0 | Ahoj světe.
1 | Jak se dnes máš?
2 | Obloha je modrá a vítr je mírný.
3 | Strojové učení pomáhá počítačům učit se z dat.
4 | Syntéza řeči převádí text na srozumitelný zvuk.
5 | Studenti četli krátký příběh v knihovně.
6 | Vlak měl zpoždění kvůli údržbě trati.
7 | Malé modely běží rychle na místních zařízeních.
8 | Hlasový asistent pomáhá s každodenními úkoly.
9 | Stabilní čtení je důležité pro dlouhé i krátké věty.

Speaker 2

0 | Ahoj světe.
1 | Jak se dnes máš?
2 | Obloha je modrá a vítr je mírný.
3 | Strojové učení pomáhá počítačům učit se z dat.
4 | Syntéza řeči převádí text na srozumitelný zvuk.
5 | Studenti četli krátký příběh v knihovně.
6 | Vlak měl zpoždění kvůli údržbě trati.
7 | Malé modely běží rychle na místních zařízeních.
8 | Hlasový asistent pomáhá s každodenními úkoly.
9 | Stabilní čtení je důležité pro dlouhé i krátké věty.

Speaker 3

0 | Ahoj světe.
1 | Jak se dnes máš?
2 | Obloha je modrá a vítr je mírný.
3 | Strojové učení pomáhá počítačům učit se z dat.
4 | Syntéza řeči převádí text na srozumitelný zvuk.
5 | Studenti četli krátký příběh v knihovně.
6 | Vlak měl zpoždění kvůli údržbě trati.
7 | Malé modely běží rychle na místních zařízeních.
8 | Hlasový asistent pomáhá s každodenními úkoly.
9 | Stabilní čtení je důležité pro dlouhé i krátké věty.

Speaker 4

0 | Ahoj světe.
1 | Jak se dnes máš?
2 | Obloha je modrá a vítr je mírný.
3 | Strojové učení pomáhá počítačům učit se z dat.
4 | Syntéza řeči převádí text na srozumitelný zvuk.
5 | Studenti četli krátký příběh v knihovně.
6 | Vlak měl zpoždění kvůli údržbě trati.
7 | Malé modely běží rychle na místních zařízeních.
8 | Hlasový asistent pomáhá s každodenními úkoly.
9 | Stabilní čtení je důležité pro dlouhé i krátké věty.

Speaker 5

0 | Ahoj světe.
1 | Jak se dnes máš?
2 | Obloha je modrá a vítr je mírný.
3 | Strojové učení pomáhá počítačům učit se z dat.
4 | Syntéza řeči převádí text na srozumitelný zvuk.
5 | Studenti četli krátký příběh v knihovně.
6 | Vlak měl zpoždění kvůli údržbě trati.
7 | Malé modely běží rychle na místních zařízeních.
8 | Hlasový asistent pomáhá s každodenními úkoly.
9 | Stabilní čtení je důležité pro dlouhé i krátké věty.

Speaker 6

0 | Ahoj světe.
1 | Jak se dnes máš?
2 | Obloha je modrá a vítr je mírný.
3 | Strojové učení pomáhá počítačům učit se z dat.
4 | Syntéza řeči převádí text na srozumitelný zvuk.
5 | Studenti četli krátký příběh v knihovně.
6 | Vlak měl zpoždění kvůli údržbě trati.
7 | Malé modely běží rychle na místních zařízeních.
8 | Hlasový asistent pomáhá s každodenními úkoly.
9 | Stabilní čtení je důležité pro dlouhé i krátké věty.

Speaker 7

0 | Ahoj světe.
1 | Jak se dnes máš?
2 | Obloha je modrá a vítr je mírný.
3 | Strojové učení pomáhá počítačům učit se z dat.
4 | Syntéza řeči převádí text na srozumitelný zvuk.
5 | Studenti četli krátký příběh v knihovně.
6 | Vlak měl zpoždění kvůli údržbě trati.
7 | Malé modely běží rychle na místních zařízeních.
8 | Hlasový asistent pomáhá s každodenními úkoly.
9 | Stabilní čtení je důležité pro dlouhé i krátké věty.

Speaker 8

0 | Ahoj světe.
1 | Jak se dnes máš?
2 | Obloha je modrá a vítr je mírný.
3 | Strojové učení pomáhá počítačům učit se z dat.
4 | Syntéza řeči převádí text na srozumitelný zvuk.
5 | Studenti četli krátký příběh v knihovně.
6 | Vlak měl zpoždění kvůli údržbě trati.
7 | Malé modely běží rychle na místních zařízeních.
8 | Hlasový asistent pomáhá s každodenními úkoly.
9 | Stabilní čtení je důležité pro dlouhé i krátké věty.

Speaker 9

0 | Ahoj světe.
1 | Jak se dnes máš?
2 | Obloha je modrá a vítr je mírný.
3 | Strojové učení pomáhá počítačům učit se z dat.
4 | Syntéza řeči převádí text na srozumitelný zvuk.
5 | Studenti četli krátký příběh v knihovně.
6 | Vlak měl zpoždění kvůli údržbě trati.
7 | Malé modely běží rychle na místních zařízeních.
8 | Hlasový asistent pomáhá s každodenními úkoly.
9 | Stabilní čtení je důležité pro dlouhé i krátké věty.

Danish

This section lists text to speech models for Danish.

vits-piper-da_DK-talesyntese-medium

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/da/da_DK/talesyntese/medium

Number of speakers | Sample rate (Hz)
1 | 22050

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-da_DK-talesyntese-medium.tar.bz2

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI | URL | China mirror
arm64-v8a | Download | Download
armeabi-v7a | Download | Download
x86_64 | Download | Download
x86 | Download | Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-da_DK-talesyntese-medium with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-da_DK-talesyntese-medium/da_DK-talesyntese-medium.onnx";
  config.model.vits.tokens = "vits-piper-da_DK-talesyntese-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-da_DK-talesyntese-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Hvis du går langsomt, men aldrig stopper, når du ender frem til dit mål.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
 -DSHERPA_ONNX_ENABLE_C_API=ON \
 -DCMAKE_BUILD_TYPE=Release \
 -DBUILD_SHARED_LIBS=ON \
 -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
 ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-piper \
  /tmp/test-piper.c \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-da_DK-talesyntese-medium.tar.bz2

You can use the following code to play with vits-piper-da_DK-talesyntese-medium

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-da_DK-talesyntese-medium/da_DK-talesyntese-medium.onnx",
            data_dir="vits-piper-da_DK-talesyntese-medium/espeak-ng-data",
            tokens="vits-piper-da_DK-talesyntese-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Hvis du går langsomt, men aldrig stopper, når du ender frem til dit mål.",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
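
The Node.js example on this page reports a real-time factor (RTF), computed as elapsed time divided by audio duration. A minimal Python sketch of the same calculation follows. The helper names real_time_factor and timed are illustrative and not part of sherpa-onnx, and the numbers in the demo are stand-ins rather than measurements.

```python
import time


def real_time_factor(elapsed_seconds: float, num_samples: int, sample_rate: int) -> float:
    """Return elapsed time divided by audio duration (below 1.0 is faster than real time)."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate
    return elapsed_seconds / duration


def timed(fn, *args, **kwargs):
    """Call fn(*args, **kwargs) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    # Stand-in numbers: 44100 samples at 22050 Hz (2.0 s of audio) produced in 0.5 s.
    print(f"RTF = {real_time_factor(0.5, 44100, 22050):.3f}")
```

With the Python example above, this could be used as audio, elapsed = timed(tts.generate, text=text, sid=0, speed=1.0), followed by real_time_factor(elapsed, len(audio.samples), audio.sample_rate).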

C++ API

You can use the following code to play with vits-piper-da_DK-talesyntese-medium with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-da_DK-talesyntese-medium/da_DK-talesyntese-medium.onnx";
  config.model.vits.tokens = "vits-piper-da_DK-talesyntese-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-da_DK-talesyntese-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Hvis du går langsomt, men aldrig stopper, når du ender frem til dit mål.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
 -DSHERPA_ONNX_ENABLE_C_API=ON \
 -DCMAKE_BUILD_TYPE=Release \
 -DBUILD_SHARED_LIBS=ON \
 -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
 ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++ \
  -std=c++17 \
  -I /tmp/sherpa-onnx/shared/include \
  -L /tmp/sherpa-onnx/shared/lib \
  -o /tmp/test-piper \
  /tmp/test-piper.cc \
  -lsherpa-onnx-cxx-api \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with vits-piper-da_DK-talesyntese-medium with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-da_DK-talesyntese-medium/da_DK-talesyntese-medium.onnx".into()),
                tokens: Some("vits-piper-da_DK-talesyntese-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-da_DK-talesyntese-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Hvis du går langsomt, men aldrig stopper, når du ender frem til dit mål.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-da_DK-talesyntese-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-da_DK-talesyntese-medium/da_DK-talesyntese-medium.onnx',
        tokens: 'vits-piper-da_DK-talesyntese-medium/tokens.txt',
        dataDir: 'vits-piper-da_DK-talesyntese-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Hvis du går langsomt, men aldrig stopper, når du ender frem til dit mål.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to play with vits-piper-da_DK-talesyntese-medium with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-da_DK-talesyntese-medium/da_DK-talesyntese-medium.onnx',
    tokens: 'vits-piper-da_DK-talesyntese-medium/tokens.txt',
    dataDir: 'vits-piper-da_DK-talesyntese-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Hvis du går langsomt, men aldrig stopper, når du ender frem til dit mål.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-da_DK-talesyntese-medium with Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-da_DK-talesyntese-medium/da_DK-talesyntese-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-da_DK-talesyntese-medium/tokens.txt",
    dataDir: "vits-piper-da_DK-talesyntese-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Hvis du går langsomt, men aldrig stopper, når du ender frem til dit mål."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-da_DK-talesyntese-medium with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-da_DK-talesyntese-medium/da_DK-talesyntese-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-da_DK-talesyntese-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-da_DK-talesyntese-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Hvis du går langsomt, men aldrig stopper, når du ender frem til dit mål.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-da_DK-talesyntese-medium with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-da_DK-talesyntese-medium/da_DK-talesyntese-medium.onnx",
        tokens = "vits-piper-da_DK-talesyntese-medium/tokens.txt",
        dataDir = "vits-piper-da_DK-talesyntese-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Hvis du går langsomt, men aldrig stopper, når du ender frem til dit mål.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with vits-piper-da_DK-talesyntese-medium with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-da_DK-talesyntese-medium/da_DK-talesyntese-medium.onnx");
    vits.setTokens("vits-piper-da_DK-talesyntese-medium/tokens.txt");
    vits.setDataDir("vits-piper-da_DK-talesyntese-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Hvis du går langsomt, men aldrig stopper, når du ender frem til dit mål.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-da_DK-talesyntese-medium with Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-da_DK-talesyntese-medium/da_DK-talesyntese-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-da_DK-talesyntese-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-da_DK-talesyntese-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Hvis du går langsomt, men aldrig stopper, når du ender frem til dit mål.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-da_DK-talesyntese-medium with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-da_DK-talesyntese-medium/da_DK-talesyntese-medium.onnx",
				Tokens:  "vits-piper-da_DK-talesyntese-medium/tokens.txt",
				DataDir: "vits-piper-da_DK-talesyntese-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Hvis du går langsomt, men aldrig stopper, når du ender frem til dit mål."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Hvis du går langsomt, men aldrig stopper, når du ender frem til dit mål.

sample audios for different speakers are listed below:

Speaker 0

supertonic-3-da

Info about this model

This model is supertonic 3 from https://huggingface.co/Supertone/supertonic-3

It supports 31 languages: en, ko, ja, ar, bg, cs, da, de, el, es, et, fi, fr, hi, hr, hu, id, it, lt, lv, nl, pl, pt, ro, ru, sk, sl, sv, tr, uk, vi.

This page shows samples for Danish (da).
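
The language is selected per request, for example through gen_config.extra["lang"] in the Python example on this page. As a small sketch, the language list above can be turned into a lookup that rejects unsupported codes before they reach the model. SUPERTONIC_3_LANGS and check_lang are hypothetical names introduced here; how sherpa-onnx itself reacts to an unknown code is not covered by this sketch.

```python
# The 31 language codes listed above for supertonic 3.
SUPERTONIC_3_LANGS = (
    "en", "ko", "ja", "ar", "bg", "cs", "da", "de", "el", "es", "et", "fi",
    "fr", "hi", "hr", "hu", "id", "it", "lt", "lv", "nl", "pl", "pt", "ro",
    "ru", "sk", "sl", "sv", "tr", "uk", "vi",
)


def check_lang(code: str) -> str:
    """Return the normalized language code, or raise ValueError if it is not listed."""
    normalized = code.strip().lower()
    if normalized not in SUPERTONIC_3_LANGS:
        raise ValueError(
            f"Unsupported language {code!r}; expected one of: "
            + ", ".join(SUPERTONIC_3_LANGS)
        )
    return normalized


if __name__ == "__main__":
    print(check_lang("da"))
    print(len(SUPERTONIC_3_LANGS))
```

The returned value can then be assigned to gen_config.extra["lang"] as in the Python example for this model.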

Number of speakers    Sample rate
10                    24000

Speaker IDs

sid: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-supertonic-3-tts-int8-2026-05-11.tar.bz2

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI            URL         China mirror
arm64-v8a      Download    Download
armeabi-v7a    Download    Download
x86_64         Download    Download
x86            Download    Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-supertonic-3-tts-int8-2026-05-11.tar.bz2

You can use the following code to play with supertonic-3

import sherpa_onnx
import soundfile as sf

tts_config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        supertonic=sherpa_onnx.OfflineTtsSupertonicModelConfig(
            duration_predictor="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
            text_encoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
            vector_estimator="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
            vocoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
            tts_json="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
            unicode_indexer="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
            voice_style="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
        ),
        debug=False,
        num_threads=2,
        provider="cpu",
    ),
)

tts = sherpa_onnx.OfflineTts(tts_config)

gen_config = sherpa_onnx.GenerationConfig()
gen_config.sid = 0
gen_config.num_steps = 8
gen_config.speed = 1.0
gen_config.extra["lang"] = "da"

audio = tts.generate("Dette er en tekst til tale-motor, der bruger næste generation af kaldi", gen_config)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
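
Since this model lists speaker IDs 0 to 9, the example above can be extended to write one file per speaker. The following is a sketch under stated assumptions: speaker_jobs and generate_all_speakers are hypothetical helper names, the file naming scheme is arbitrary, and the sherpa_onnx calls are limited to the ones already shown above (GenerationConfig, its sid, num_steps, speed and extra fields, and tts.generate).

```python
def speaker_jobs(num_speakers: int, prefix: str = "test") -> list:
    """Return (sid, filename) pairs for speaker IDs 0 .. num_speakers - 1."""
    if num_speakers <= 0:
        raise ValueError("num_speakers must be positive")
    return [(sid, f"{prefix}-sid{sid}.wav") for sid in range(num_speakers)]


def generate_all_speakers(tts, text: str, lang: str = "da", num_speakers: int = 10) -> list:
    """Generate `text` once per speaker and return the written filenames.

    `tts` is expected to be a sherpa_onnx.OfflineTts built as in the example above.
    """
    # Imported lazily so that speaker_jobs works without these packages installed.
    import sherpa_onnx
    import soundfile as sf

    written = []
    for sid, filename in speaker_jobs(num_speakers):
        gen_config = sherpa_onnx.GenerationConfig()
        gen_config.sid = sid
        gen_config.num_steps = 8
        gen_config.speed = 1.0
        gen_config.extra["lang"] = lang
        audio = tts.generate(text, gen_config)
        sf.write(filename, audio.samples, samplerate=audio.sample_rate)
        written.append(filename)
    return written


if __name__ == "__main__":
    # Only prints the planned output files; no model is needed for this part.
    for sid, filename in speaker_jobs(10):
        print(sid, filename)
```

Calling generate_all_speakers(tts, text) with the tts object created above would write test-sid0.wav through test-sid9.wav to the current directory.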

C API

You can use the following code to play with supertonic-3 with C API.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));
  config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
  config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
  config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
  config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
  config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
  config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
  config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";

  config.model.num_threads = 2;

  // If you don't want to see debug messages, please set it to 0
  config.model.debug = 1;

  const char *text = "Dette er en tekst til tale-motor, der bruger næste generation af kaldi";

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.num_steps = 8;
  gen_cfg.speed = 1.0;
  gen_cfg.extra = "{\"lang\": \"da\"}";

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg,
                                             ProgressCallback, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
 -DSHERPA_ONNX_ENABLE_C_API=ON \
 -DCMAKE_BUILD_TYPE=Release \
 -DBUILD_SHARED_LIBS=ON \
 -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
 ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-supertonic.c. Then you can compile it with the following command:

gcc \
  -I /tmp/sherpa-onnx/shared/include \
  -L /tmp/sherpa-onnx/shared/lib \
  -o /tmp/test-supertonic \
  /tmp/test-supertonic.c \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-supertonic.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html
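
In the C example above, gen_cfg.extra is a JSON object written by hand as an escaped C string. When more keys or other languages are involved, it may be easier to generate that string than to escape it manually. The sketch below uses Python's json module to do so; make_extra and as_c_literal are hypothetical helpers, and only the "lang" key is taken from this page.

```python
import json


def make_extra(lang: str) -> str:
    """Serialize the extra options as a JSON string, e.g. {"lang": "da"}."""
    return json.dumps({"lang": lang})


def as_c_literal(s: str) -> str:
    """Wrap a string in double quotes and escape it for pasting into C source."""
    return '"' + s.replace("\\", "\\\\").replace('"', '\\"') + '"'


if __name__ == "__main__":
    extra = make_extra("da")
    print(extra)                # the JSON text itself
    print(as_c_literal(extra))  # the same text as a C string literal
```

For "da", the second line printed matches the literal assigned to gen_cfg.extra in the C example.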

C++ API

You can use the following code to play with supertonic-3 with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;

  config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
  config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
  config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
  config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
  config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
  config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
  config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";

  config.model.num_threads = 2;

  // If you don't want to see debug messages, please set it to 0
  config.model.debug = 1;

  std::string filename = "./test.wav";
  std::string text = "Dette er en tekst til tale-motor, der bruger næste generation af kaldi";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.num_steps = 8;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed
  gen_cfg.extra = R"({"lang": "da"})";

  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
 -DSHERPA_ONNX_ENABLE_C_API=ON \
 -DCMAKE_BUILD_TYPE=Release \
 -DBUILD_SHARED_LIBS=ON \
 -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
 ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-supertonic.cc. Then you can compile it with the following command:

g++ \
  -std=c++17 \
  -I /tmp/sherpa-onnx/shared/include \
  -L /tmp/sherpa-onnx/shared/lib \
  -o /tmp/test-supertonic \
  /tmp/test-supertonic.cc \
  -lsherpa-onnx-cxx-api \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-supertonic.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with supertonic-3 with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsSupertonicModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            supertonic: OfflineTtsSupertonicModelConfig {
                duration_predictor: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx".into()),
                text_encoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx".into()),
                vector_estimator: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx".into()),
                vocoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx".into()),
                tts_json: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json".into()),
                unicode_indexer: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin".into()),
                voice_style: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Dette er en tekst til tale-motor, der bruger næste generation af kaldi";

    let gen_config = GenerationConfig {
        sid: 0,
        num_steps: 8,
        speed: 1.0,
        extra: Some(r#"{"lang": "da"}"#.into()),
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with supertonic-3 with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      supertonic: {
        durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
        textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
        vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
        vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
        ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
        unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
        voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
      },
      debug: true,
      numThreads: 2,
      provider: 'cpu',
    },
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Dette er en tekst til tale-motor, der bruger næste generation af kaldi';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  numSteps: 8,
  speed: 1.0,
  extra: {lang: 'da'},
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to play with supertonic-3 with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final supertonic = sherpa_onnx.OfflineTtsSupertonicModelConfig(
    durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
    textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
    vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
    vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
    ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
    unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
    voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    supertonic: supertonic,
    numThreads: 2,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    numSteps: 8,
    speed: 1.0,
    extra: {'lang': 'da'},
  );
  final audio = tts.generateWithConfig(text: 'Dette er en tekst til tale-motor, der bruger næste generation af kaldi', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with supertonic-3 with Swift API.

func run() {
  let supertonic = sherpaOnnxOfflineTtsSupertonicModelConfig(
    durationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
    textEncoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
    vectorEstimator: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
    vocoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
    ttsJson: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
    unicodeIndexer: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
    voiceStyle: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(supertonic: supertonic)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Dette er en tekst til tale-motor, der bruger næste generation af kaldi"
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.numSteps = 8
  genConfig.speed = 1.0
  genConfig.extra = ["lang": "da"]

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with supertonic-3 with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Supertonic.DurationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.Model.Supertonic.TextEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.Model.Supertonic.VectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.Model.Supertonic.Vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.Model.Supertonic.TtsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.Model.Supertonic.UnicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.Model.Supertonic.VoiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";

config.Model.NumThreads = 2;
config.Model.Debug = 1;
config.Model.Provider = "cpu";

var tts = new OfflineTts(config);
var text = "Dette er en tekst til tale-motor, der bruger næste generation af kaldi";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.NumSteps = 8;
genConfig.Speed = 1.0f;
genConfig.Extra = "{\"lang\": \"da\"}";

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.
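
The C# example above passes the extra generation options as a JSON string (`{"lang": "da"}`), which is easy to get wrong when quotes are escaped by hand. The following is a minimal Python sketch of building that string from keyword arguments with the standard `json` module. The helper name `make_extra` is hypothetical and is not part of sherpa-onnx; the sketch only shows how the string itself can be produced.

```python
import json


def make_extra(**options) -> str:
    """Serialize generation options to a JSON object string.

    `make_extra` is an illustrative helper, not a sherpa-onnx function.
    """
    # ensure_ascii=False keeps non-ASCII values readable in the output.
    return json.dumps(options, ensure_ascii=False)


extra = make_extra(lang="da")
print(extra)
```

Generating the string from a dictionary avoids manual escaping and keeps the option names in one place if more keys are added later.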

Kotlin API

You can use the following code to play with supertonic-3 with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      supertonic = OfflineTtsSupertonicModelConfig(
        durationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
        textEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
        vectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
        vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
        ttsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
        unicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
        voiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
      ),
      numThreads = 2,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    numSteps = 8,
    speed = 1.0f,
    extra = mapOf("lang" to "da"),
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Dette er en tekst til tale-motor, der bruger næste generation af kaldi",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with supertonic-3 with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var supertonic = new OfflineTtsSupertonicModelConfig();
    supertonic.setDurationPredictor("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx");
    supertonic.setTextEncoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx");
    supertonic.setVectorEstimator("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx");
    supertonic.setVocoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx");
    supertonic.setTtsJson("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json");
    supertonic.setUnicodeIndexer("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin");
    supertonic.setVoiceStyle("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setSupertonic(supertonic);
    modelConfig.setNumThreads(2);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);

    var tts = new OfflineTts(config);
    var text = "Dette er en tekst til tale-motor, der bruger næste generation af kaldi";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setNumSteps(8);
    genConfig.setSpeed(1.0f);
    genConfig.setExtra("{\"lang\": \"da\"}");

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with supertonic-3 with Pascal API.

program test_supertonic;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Supertonic.DurationPredictor := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx';
  Config.Model.Supertonic.TextEncoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx';
  Config.Model.Supertonic.VectorEstimator := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx';
  Config.Model.Supertonic.Vocoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx';
  Config.Model.Supertonic.TtsJson := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json';
  Config.Model.Supertonic.UnicodeIndexer := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin';
  Config.Model.Supertonic.VoiceStyle := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin';

  Config.Model.NumThreads := 2;
  Config.Model.Debug := True;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.NumSteps := 8;
  GenConfig.Speed := 1.0;
  GenConfig.Extra := '{"lang": "da"}';

  Audio := Tts.GenerateWithConfig('Dette er en tekst til tale-motor, der bruger næste generation af kaldi', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with supertonic-3 with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Supertonic: sherpa.OfflineTtsSupertonicModelConfig{
				DurationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
				TextEncoder:       "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
				VectorEstimator:   "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
				Vocoder:           "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
				TtsJson:           "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
				UnicodeIndexer:    "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
				VoiceStyle:        "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
			},
			NumThreads: 2,
			Debug:      true,
		},
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Dette er en tekst til tale-motor, der bruger næste generation af kaldi"

	genConfig := sherpa.GenerationConfig{
		Sid:      0,
		NumSteps: 8,
		Speed:    1.0,
		Extra:    `{"lang": "da"}`,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

Sample audio is provided for ten speakers (Speaker 0 to Speaker 9). Each speaker reads the same ten sentences:

Index | Text
0 | Hej verden.
1 | Hvordan har du det i dag?
2 | Himlen er blå, og vinden er mild.
3 | Maskinlæring hjælper computere med at lære af data.
4 | Talesyntese omdanner tekst til tydelig lyd.
5 | Eleverne læste en kort historie på biblioteket.
6 | Toget blev forsinket på grund af sporarbejde.
7 | Små modeller kører hurtigt på lokale enheder.
8 | En stemmeassistent hjælper med daglige opgaver.
9 | Stabil oplæsning er vigtig for både korte og lange sætninger.

Dutch

This section lists text to speech models for Dutch.

vits-piper-nl_BE-nathalie-medium

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/nl/nl_BE/nathalie/medium

Number of speakers | Sample rate (Hz)
1 | 22050

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-nl_BE-nathalie-medium.tar.bz2
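
Other models in this document are fetched with `wget` and unpacked with `tar`. For scripted setups, the following is a minimal Python sketch of the same two steps using only the standard library. The helper names `download_model` and `extract_model` are hypothetical and are not part of sherpa-onnx.

```python
import tarfile
import urllib.request
from pathlib import Path

MODEL_URL = (
    "https://github.com/k2-fsa/sherpa-onnx/releases/download/"
    "tts-models/vits-piper-nl_BE-nathalie-medium.tar.bz2"
)


def download_model(url: str, dest_dir: str = ".") -> Path:
    """Download the archive at `url` into `dest_dir` and return its path."""
    dest = Path(dest_dir) / url.rsplit("/", 1)[-1]
    urllib.request.urlretrieve(url, str(dest))
    return dest


def extract_model(archive: str, dest_dir: str = ".") -> list:
    """Extract a .tar.bz2 archive and return the names of its members."""
    with tarfile.open(archive, "r:bz2") as tar:
        names = tar.getnames()
        # Use the safer "data" extraction filter when this Python provides it.
        if hasattr(tarfile, "data_filter"):
            tar.extractall(dest_dir, filter="data")
        else:
            tar.extractall(dest_dir)
    return names
```

A typical call would be `extract_model(str(download_model(MODEL_URL)))`, after which the model files are expected under `vits-piper-nl_BE-nathalie-medium/`, the directory name used by the examples in this section.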

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI | URL | China mirror
arm64-v8a | Download | Download
armeabi-v7a | Download | Download
x86_64 | Download | Download
x86 | Download | Download

If you don’t know what ABI is, you probably need to select arm64-v8a.
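
If the device's CPU architecture is known, the matching row of the ABI table can be looked up mechanically. The following Python sketch maps architecture names, as commonly reported by `uname -m`, to the ABI names used above. The function `android_abi` and the mapping table are illustrative assumptions, not part of sherpa-onnx.

```python
import platform
from typing import Optional

# Architecture names as commonly reported by `uname -m`, mapped to the
# ABI names used in the APK table. This mapping is illustrative only.
_MACHINE_TO_ABI = {
    "aarch64": "arm64-v8a",
    "arm64": "arm64-v8a",
    "armv7l": "armeabi-v7a",
    "armv8l": "armeabi-v7a",  # 32-bit userland on a 64-bit ARM CPU
    "x86_64": "x86_64",
    "amd64": "x86_64",
    "i686": "x86",
    "i386": "x86",
}


def android_abi(machine: Optional[str] = None) -> Optional[str]:
    """Return the Android ABI name for an architecture string, or None."""
    if machine is None:
        machine = platform.machine()
    return _MACHINE_TO_ABI.get(machine.lower())
```

On a connected device, `adb shell getprop ro.product.cpu.abi` reports the ABI directly, which is usually simpler than guessing from the architecture name.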

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-nl_BE-nathalie-medium with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-nl_BE-nathalie-medium/nl_BE-nathalie-medium.onnx";
  config.model.vits.tokens = "vits-piper-nl_BE-nathalie-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-nl_BE-nathalie-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "God schiep het water, maar de Nederlander schiep de dijk";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc -o /tmp/test-piper /tmp/test-piper.c -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx soundfile

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-nl_BE-nathalie-medium.tar.bz2

You can use the following code to play with vits-piper-nl_BE-nathalie-medium with Python API.

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-nl_BE-nathalie-medium/nl_BE-nathalie-medium.onnx",
            data_dir="vits-piper-nl_BE-nathalie-medium/espeak-ng-data",
            tokens="vits-piper-nl_BE-nathalie-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="God schiep het water, maar de Nederlander schiep de dijk",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
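
To confirm that a generated WAV file, such as the test.wav written by the other examples in this section, has the sample rate listed for this model (22050 Hz), the header can be inspected with Python's standard `wave` module. This is a minimal sketch that assumes the file is uncompressed PCM; `wav_info` is a hypothetical helper name.

```python
import wave


def wav_info(path: str) -> dict:
    """Return basic header information for an uncompressed PCM WAV file."""
    with wave.open(path, "rb") as f:
        frames = f.getnframes()
        rate = f.getframerate()
        return {
            "sample_rate": rate,
            "channels": f.getnchannels(),
            "bits_per_sample": 8 * f.getsampwidth(),
            "duration_seconds": frames / rate,
        }
```

For example, `wav_info("test.wav")["sample_rate"]` can be compared against the value in the model table to catch a mismatched model or output path early.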

C++ API

You can use the following code to play with vits-piper-nl_BE-nathalie-medium with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-nl_BE-nathalie-medium/nl_BE-nathalie-medium.onnx";
  config.model.vits.tokens = "vits-piper-nl_BE-nathalie-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-nl_BE-nathalie-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "God schiep het water, maar de Nederlander schiep de dijk";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++ -std=c++17 -o /tmp/test-piper /tmp/test-piper.cc -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html
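
The progress callback in the C++ example returns 1 to continue generating and 0 to stop. The following Python sketch illustrates that contract with a callable that requests a stop once a given amount of audio has been received. The class name `StopAfterSeconds` is hypothetical, and the `(samples, progress)` argument order simply mirrors the C++ callback; whether a particular binding passes the same arguments is an assumption.

```python
class StopAfterSeconds:
    """Callable that asks the generator to stop once enough audio has arrived."""

    def __init__(self, max_seconds: float, sample_rate: int):
        # Number of samples after which generation should stop.
        self.max_samples = int(max_seconds * sample_rate)
        self.received = 0

    def __call__(self, samples, progress: float) -> int:
        # Count the samples delivered in this chunk.
        self.received += len(samples)
        # 1 means continue generating, 0 means stop.
        return 1 if self.received < self.max_samples else 0
```

Keeping the counter on the object rather than in a global makes it possible to reuse the same logic for several generation calls by creating one instance per call.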

Rust API

You can use the following code to play with vits-piper-nl_BE-nathalie-medium with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-nl_BE-nathalie-medium/nl_BE-nathalie-medium.onnx".into()),
                tokens: Some("vits-piper-nl_BE-nathalie-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-nl_BE-nathalie-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "God schiep het water, maar de Nederlander schiep de dijk";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-nl_BE-nathalie-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-nl_BE-nathalie-medium/nl_BE-nathalie-medium.onnx',
        tokens: 'vits-piper-nl_BE-nathalie-medium/tokens.txt',
        dataDir: 'vits-piper-nl_BE-nathalie-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'God schiep het water, maar de Nederlander schiep de dijk';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.
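
The Node.js example above reports a real-time factor (RTF), computed as elapsed processing time divided by the duration of the generated audio. The same arithmetic is sketched below in Python; `real_time_factor` is a hypothetical helper name, and the inputs correspond to the elapsed time, `audio.samples.length`, and `audio.sampleRate` in the example.

```python
def real_time_factor(elapsed_seconds: float, num_samples: int,
                     sample_rate: int) -> float:
    """Return elapsed time divided by audio duration.

    A value below 1.0 means synthesis ran faster than real time.
    """
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration_seconds = num_samples / sample_rate
    return elapsed_seconds / duration_seconds
```

For instance, 44100 samples at 22050 Hz are 2 seconds of audio, so an elapsed time of 0.5 seconds gives an RTF of 0.25.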

Dart API

You can use the following code to play with vits-piper-nl_BE-nathalie-medium with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-nl_BE-nathalie-medium/nl_BE-nathalie-medium.onnx',
    tokens: 'vits-piper-nl_BE-nathalie-medium/tokens.txt',
    dataDir: 'vits-piper-nl_BE-nathalie-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'God schiep het water, maar de Nederlander schiep de dijk', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-nl_BE-nathalie-medium with Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-nl_BE-nathalie-medium/nl_BE-nathalie-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-nl_BE-nathalie-medium/tokens.txt",
    dataDir: "vits-piper-nl_BE-nathalie-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "God schiep het water, maar de Nederlander schiep de dijk"
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-nl_BE-nathalie-medium with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-nl_BE-nathalie-medium/nl_BE-nathalie-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-nl_BE-nathalie-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-nl_BE-nathalie-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "God schiep het water, maar de Nederlander schiep de dijk";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-nl_BE-nathalie-medium with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-nl_BE-nathalie-medium/nl_BE-nathalie-medium.onnx",
        tokens = "vits-piper-nl_BE-nathalie-medium/tokens.txt",
        dataDir = "vits-piper-nl_BE-nathalie-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "God schiep het water, maar de Nederlander schiep de dijk",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with vits-piper-nl_BE-nathalie-medium with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-nl_BE-nathalie-medium/nl_BE-nathalie-medium.onnx");
    vits.setTokens("vits-piper-nl_BE-nathalie-medium/tokens.txt");
    vits.setDataDir("vits-piper-nl_BE-nathalie-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "God schiep het water, maar de Nederlander schiep de dijk";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-nl_BE-nathalie-medium with Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-nl_BE-nathalie-medium/nl_BE-nathalie-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-nl_BE-nathalie-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-nl_BE-nathalie-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('God schiep het water, maar de Nederlander schiep de dijk', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-nl_BE-nathalie-medium with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-nl_BE-nathalie-medium/nl_BE-nathalie-medium.onnx",
				Tokens:  "vits-piper-nl_BE-nathalie-medium/tokens.txt",
				DataDir: "vits-piper-nl_BE-nathalie-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "God schiep het water, maar de Nederlander schiep de dijk"

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

God schiep het water, maar de Nederlander schiep de dijk

sample audios for different speakers are listed below:

Speaker 0

vits-piper-nl_BE-nathalie-x_low

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/nl/nl_BE/nathalie/x_low

Number of speakers | Sample rate
1 | 16000

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-nl_BE-nathalie-x_low.tar.bz2
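
The download and extraction can also be scripted. The following is a minimal sketch using only the Python standard library. The helper names (archive_name, model_dir_name, download_and_extract) are hypothetical, and the sketch assumes the archive unpacks into a directory named after the archive, as the paths in the examples on this page suggest.

```python
import tarfile
import urllib.request
from pathlib import Path, PurePosixPath
from urllib.parse import urlparse

MODEL_URL = (
    "https://github.com/k2-fsa/sherpa-onnx/releases/download/"
    "tts-models/vits-piper-nl_BE-nathalie-x_low.tar.bz2"
)


def archive_name(url: str) -> str:
    """Return the file name at the end of the URL path."""
    return PurePosixPath(urlparse(url).path).name


def model_dir_name(url: str) -> str:
    """Return the directory the archive is ASSUMED to unpack into."""
    name = archive_name(url)
    suffix = ".tar.bz2"
    return name[: -len(suffix)] if name.endswith(suffix) else name


def download_and_extract(url: str, dest: str = ".") -> Path:
    """Download the archive into dest, unpack it, and delete the archive."""
    dest_path = Path(dest)
    dest_path.mkdir(parents=True, exist_ok=True)
    archive = dest_path / archive_name(url)
    urllib.request.urlretrieve(url, archive)
    with tarfile.open(archive, "r:bz2") as tar:
        tar.extractall(dest_path)
    archive.unlink()
    return dest_path / model_dir_name(url)
```

Calling download_and_extract(MODEL_URL) from the directory where you run the examples should leave the model directory next to your script, provided the assumption about the archive layout holds.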

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI | URL | China mirror
arm64-v8a | Download | Download
armeabi-v7a | Download | Download
x86_64 | Download | Download
x86 | Download | Download

If you don’t know what ABI is, you probably need to select arm64-v8a.
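
If you can run Python on the device itself (for example in a terminal app), the machine name it reports gives a hint about the ABI. The following is a rough sketch: the mapping covers only common machine names, abi_for_machine is a hypothetical helper, and falling back to arm64-v8a simply mirrors the advice above.

```python
import platform

# Common machine names (as reported by `uname -m`) and the Android ABI
# they usually correspond to. This table is illustrative, not exhaustive.
_MACHINE_TO_ABI = {
    "aarch64": "arm64-v8a",
    "arm64": "arm64-v8a",
    "armv7l": "armeabi-v7a",
    "x86_64": "x86_64",
    "amd64": "x86_64",
    "i686": "x86",
    "i386": "x86",
}


def abi_for_machine(machine: str, default: str = "arm64-v8a") -> str:
    """Map a machine name to an Android ABI, falling back to the default."""
    return _MACHINE_TO_ABI.get(machine.lower(), default)


def abi_for_this_machine() -> str:
    """Guess the ABI for the machine this script is running on."""
    return abi_for_machine(platform.machine())
```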

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-nl_BE-nathalie-x_low with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-nl_BE-nathalie-x_low/nl_BE-nathalie-x_low.onnx";
  config.model.vits.tokens = "vits-piper-nl_BE-nathalie-x_low/tokens.txt";
  config.model.vits.data_dir = "vits-piper-nl_BE-nathalie-x_low/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "God schiep het water, maar de Nederlander schiep de dijk";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and libraries inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc   -I /tmp/sherpa-onnx/shared/include   -o /tmp/test-piper   /tmp/test-piper.c   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-nl_BE-nathalie-x_low.tar.bz2

You can use the following code to play with vits-piper-nl_BE-nathalie-x_low with Python API.

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-nl_BE-nathalie-x_low/nl_BE-nathalie-x_low.onnx",
            data_dir="vits-piper-nl_BE-nathalie-x_low/espeak-ng-data",
            tokens="vits-piper-nl_BE-nathalie-x_low/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="God schiep het water, maar de Nederlander schiep de dijk",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
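
If the config check above fails, one thing worth verifying is that the three paths exist. The following is a minimal sketch using only the standard library. missing_model_files is a hypothetical helper, not part of sherpa-onnx, and it assumes the extracted directory contains exactly the entries referenced in the config above.

```python
from pathlib import Path


def missing_model_files(model_dir: str, voice: str) -> list:
    """Return the expected paths under model_dir that do not exist."""
    root = Path(model_dir)
    expected = [
        root / f"{voice}.onnx",   # acoustic model
        root / "tokens.txt",      # token table
        root / "espeak-ng-data",  # espeak-ng data directory
    ]
    return [str(p) for p in expected if not p.exists()]
```

For this model, missing_model_files("vits-piper-nl_BE-nathalie-x_low", "nl_BE-nathalie-x_low") returns an empty list when everything is in place, and otherwise the paths to look for.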

C++ API

You can use the following code to play with vits-piper-nl_BE-nathalie-x_low with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-nl_BE-nathalie-x_low/nl_BE-nathalie-x_low.onnx";
  config.model.vits.tokens = "vits-piper-nl_BE-nathalie-x_low/tokens.txt";
  config.model.vits.data_dir = "vits-piper-nl_BE-nathalie-x_low/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "God schiep het water, maar de Nederlander schiep de dijk";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and libraries inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++   -std=c++17   -I /tmp/sherpa-onnx/shared/include   -o /tmp/test-piper   /tmp/test-piper.cc   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-cxx-api   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with vits-piper-nl_BE-nathalie-x_low with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-nl_BE-nathalie-x_low/nl_BE-nathalie-x_low.onnx".into()),
                tokens: Some("vits-piper-nl_BE-nathalie-x_low/tokens.txt".into()),
                data_dir: Some("vits-piper-nl_BE-nathalie-x_low/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "God schiep het water, maar de Nederlander schiep de dijk";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-nl_BE-nathalie-x_low with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-nl_BE-nathalie-x_low/nl_BE-nathalie-x_low.onnx',
        tokens: 'vits-piper-nl_BE-nathalie-x_low/tokens.txt',
        dataDir: 'vits-piper-nl_BE-nathalie-x_low/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'God schiep het water, maar de Nederlander schiep de dijk';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.
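
The real-time factor (RTF) printed by the example above is the processing time divided by the duration of the generated audio. As a small sketch of the same arithmetic in Python (real_time_factor is a hypothetical helper name):

```python
def real_time_factor(elapsed_seconds: float, num_samples: int,
                     sample_rate: int) -> float:
    """RTF = processing time / audio duration.

    A value below 1.0 means generation ran faster than real time.
    """
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # audio length in seconds
    return elapsed_seconds / duration
```

For example, producing 32000 samples at 16000 Hz (2 seconds of audio) in 0.5 seconds gives an RTF of 0.25.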

Dart API

You can use the following code to play with vits-piper-nl_BE-nathalie-x_low with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-nl_BE-nathalie-x_low/nl_BE-nathalie-x_low.onnx',
    tokens: 'vits-piper-nl_BE-nathalie-x_low/tokens.txt',
    dataDir: 'vits-piper-nl_BE-nathalie-x_low/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'God schiep het water, maar de Nederlander schiep de dijk', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-nl_BE-nathalie-x_low with Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-nl_BE-nathalie-x_low/nl_BE-nathalie-x_low.onnx",
    lexicon: "",
    tokens: "vits-piper-nl_BE-nathalie-x_low/tokens.txt",
    dataDir: "vits-piper-nl_BE-nathalie-x_low/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "God schiep het water, maar de Nederlander schiep de dijk"
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-nl_BE-nathalie-x_low with C# API.

using System;
using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-nl_BE-nathalie-x_low/nl_BE-nathalie-x_low.onnx";
config.Model.Vits.Tokens = "vits-piper-nl_BE-nathalie-x_low/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-nl_BE-nathalie-x_low/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "God schiep het water, maar de Nederlander schiep de dijk";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-nl_BE-nathalie-x_low with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-nl_BE-nathalie-x_low/nl_BE-nathalie-x_low.onnx",
        tokens = "vits-piper-nl_BE-nathalie-x_low/tokens.txt",
        dataDir = "vits-piper-nl_BE-nathalie-x_low/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "God schiep het water, maar de Nederlander schiep de dijk",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.
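
The callback in the example above always returns 1, so generation runs to completion. To illustrate the continue/stop convention, here is a sketch in Python of a callback that asks generation to stop once a sample budget has been reached. make_budget_callback is a hypothetical helper, and the optional progress parameter is an assumption about what a binding may pass; only the return convention (1 to continue, 0 to stop) is taken from the example above.

```python
def make_budget_callback(max_samples: int):
    """Build a callback that returns 1 until max_samples were received, then 0."""
    received = 0

    def callback(samples, progress: float = 0.0) -> int:
        nonlocal received
        received += len(samples)
        # 1 means to continue, 0 means to stop
        return 1 if received < max_samples else 0

    return callback
```

Each call adds the size of the new chunk to a running total, so the decision depends on everything received so far rather than on a single chunk.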

Java API

You can use the following code to play with vits-piper-nl_BE-nathalie-x_low with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-nl_BE-nathalie-x_low/nl_BE-nathalie-x_low.onnx");
    vits.setTokens("vits-piper-nl_BE-nathalie-x_low/tokens.txt");
    vits.setDataDir("vits-piper-nl_BE-nathalie-x_low/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "God schiep het water, maar de Nederlander schiep de dijk";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-nl_BE-nathalie-x_low with Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-nl_BE-nathalie-x_low/nl_BE-nathalie-x_low.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-nl_BE-nathalie-x_low/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-nl_BE-nathalie-x_low/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('God schiep het water, maar de Nederlander schiep de dijk', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-nl_BE-nathalie-x_low with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-nl_BE-nathalie-x_low/nl_BE-nathalie-x_low.onnx",
				Tokens:  "vits-piper-nl_BE-nathalie-x_low/tokens.txt",
				DataDir: "vits-piper-nl_BE-nathalie-x_low/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "God schiep het water, maar de Nederlander schiep de dijk"

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

God schiep het water, maar de Nederlander schiep de dijk

sample audios for different speakers are listed below:

Speaker 0

vits-piper-nl_NL-alex-medium

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/nl/nl_NL/alex/medium

Number of speakers | Sample rate
1 | 22050

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-nl_NL-alex-medium.tar.bz2

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI | URL | China mirror
arm64-v8a | Download | Download
armeabi-v7a | Download | Download
x86_64 | Download | Download
x86 | Download | Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-nl_NL-alex-medium with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-nl_NL-alex-medium/nl_NL-alex-medium.onnx";
  config.model.vits.tokens = "vits-piper-nl_NL-alex-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-nl_NL-alex-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "God schiep het water, maar de Nederlander schiep de dijk";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and libraries inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc   -I /tmp/sherpa-onnx/shared/include   -o /tmp/test-piper   /tmp/test-piper.c   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.
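
If you launch the compiled program from a script instead of a shell, the same library path can be passed through the environment of the child process. The following is a sketch only: env_with_library_path is a hypothetical helper, and it assumes the variable names shown above for Linux and macOS.

```python
import os
import sys


def env_with_library_path(lib_dir: str, base_env=None,
                          platform_name: str = sys.platform) -> dict:
    """Return a copy of the environment with lib_dir prepended to the
    dynamic library search path variable for the given platform."""
    env = dict(os.environ if base_env is None else base_env)
    var = "DYLD_LIBRARY_PATH" if platform_name == "darwin" else "LD_LIBRARY_PATH"
    old = env.get(var, "")
    env[var] = f"{lib_dir}:{old}" if old else lib_dir
    return env
```

The result can be handed to subprocess.run(["/tmp/test-piper"], cwd="/tmp", env=env_with_library_path("/tmp/sherpa-onnx/shared/lib")), which leaves the environment of the calling process untouched.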

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-nl_NL-alex-medium.tar.bz2

You can use the following code to play with vits-piper-nl_NL-alex-medium with Python API.

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-nl_NL-alex-medium/nl_NL-alex-medium.onnx",
            data_dir="vits-piper-nl_NL-alex-medium/espeak-ng-data",
            tokens="vits-piper-nl_NL-alex-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="God schiep het water, maar de Nederlander schiep de dijk",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
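
If soundfile is not available, the samples can also be saved with Python's standard wave module. The following is a minimal sketch, assuming the generated samples are mono floats in the range [-1, 1]. write_pcm16_wave is a hypothetical helper name and is not part of sherpa-onnx.

```python
import struct
import wave


def write_pcm16_wave(filename: str, samples, sample_rate: int) -> None:
    """Write mono float samples (assumed to lie in [-1, 1]) as 16-bit PCM."""
    frames = bytearray()
    for s in samples:
        clipped = max(-1.0, min(1.0, float(s)))  # guard against overshoot
        frames += struct.pack("<h", int(round(clipped * 32767)))
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)   # mono
        f.setsampwidth(2)   # 2 bytes per sample = 16 bit
        f.setframerate(sample_rate)
        f.writeframes(bytes(frames))
```

With the objects from the example above, this would be called as write_pcm16_wave("test.wav", audio.samples, audio.sample_rate).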

C++ API

You can use the following code to play with vits-piper-nl_NL-alex-medium with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-nl_NL-alex-medium/nl_NL-alex-medium.onnx";
  config.model.vits.tokens = "vits-piper-nl_NL-alex-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-nl_NL-alex-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "God schiep het water, maar de Nederlander schiep de dijk";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and libraries inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++   -std=c++17   -I /tmp/sherpa-onnx/shared/include   -o /tmp/test-piper   /tmp/test-piper.cc   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-cxx-api   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with vits-piper-nl_NL-alex-medium with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-nl_NL-alex-medium/nl_NL-alex-medium.onnx".into()),
                tokens: Some("vits-piper-nl_NL-alex-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-nl_NL-alex-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "God schiep het water, maar de Nederlander schiep de dijk";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-nl_NL-alex-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-nl_NL-alex-medium/nl_NL-alex-medium.onnx',
        tokens: 'vits-piper-nl_NL-alex-medium/tokens.txt',
        dataDir: 'vits-piper-nl_NL-alex-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'God schiep het water, maar de Nederlander schiep de dijk';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to play with vits-piper-nl_NL-alex-medium with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-nl_NL-alex-medium/nl_NL-alex-medium.onnx',
    tokens: 'vits-piper-nl_NL-alex-medium/tokens.txt',
    dataDir: 'vits-piper-nl_NL-alex-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'God schiep het water, maar de Nederlander schiep de dijk', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-nl_NL-alex-medium with Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-nl_NL-alex-medium/nl_NL-alex-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-nl_NL-alex-medium/tokens.txt",
    dataDir: "vits-piper-nl_NL-alex-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "God schiep het water, maar de Nederlander schiep de dijk"
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to try vits-piper-nl_NL-alex-medium with the C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-nl_NL-alex-medium/nl_NL-alex-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-nl_NL-alex-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-nl_NL-alex-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "God schiep het water, maar de Nederlander schiep de dijk";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to try vits-piper-nl_NL-alex-medium with the Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-nl_NL-alex-medium/nl_NL-alex-medium.onnx",
        tokens = "vits-piper-nl_NL-alex-medium/tokens.txt",
        dataDir = "vits-piper-nl_NL-alex-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "God schiep het water, maar de Nederlander schiep de dijk",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.
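In the example above, the callback's return value controls generation: 1 means continue and 0 means stop. The Python sketch below simulates that contract with a stand-in loop. The names run_with_callback and make_limit_callback, the chunk sizes, and the sample limit are hypothetical, and the code does not call sherpa-onnx.

```python
from typing import Callable, List, Sequence


def run_with_callback(
    chunks: Sequence[List[float]],
    callback: Callable[[List[float]], int],
) -> List[float]:
    """Feed chunks to the callback and stop once it returns 0."""
    collected: List[float] = []
    for chunk in chunks:
        collected.extend(chunk)
        if callback(chunk) == 0:  # 0 means stop, 1 means continue
            break
    return collected


def make_limit_callback(max_samples: int) -> Callable[[List[float]], int]:
    """Build a callback that asks to stop once max_samples samples were seen."""
    seen = 0

    def callback(samples: List[float]) -> int:
        nonlocal seen
        seen += len(samples)
        return 1 if seen < max_samples else 0

    return callback


# Three hypothetical chunks of 4 samples each; stop once 8 samples were seen.
fake_chunks = [[0.0] * 4, [0.1] * 4, [0.2] * 4]
audio = run_with_callback(fake_chunks, make_limit_callback(max_samples=8))
print(len(audio))
```

A callback of this shape is one way to cut generation short, for example when a user cancels playback.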

Java API

You can use the following code to try vits-piper-nl_NL-alex-medium with the Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-nl_NL-alex-medium/nl_NL-alex-medium.onnx");
    vits.setTokens("vits-piper-nl_NL-alex-medium/tokens.txt");
    vits.setDataDir("vits-piper-nl_NL-alex-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "God schiep het water, maar de Nederlander schiep de dijk";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to try vits-piper-nl_NL-alex-medium with the Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-nl_NL-alex-medium/nl_NL-alex-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-nl_NL-alex-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-nl_NL-alex-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('God schiep het water, maar de Nederlander schiep de dijk', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to try vits-piper-nl_NL-alex-medium with the Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-nl_NL-alex-medium/nl_NL-alex-medium.onnx",
				Tokens:  "vits-piper-nl_NL-alex-medium/tokens.txt",
				DataDir: "vits-piper-nl_NL-alex-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "God schiep het water, maar de Nederlander schiep de dijk"

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

God schiep het water, maar de Nederlander schiep de dijk

sample audio for each speaker is listed below:

Speaker 0

vits-piper-nl_NL-dii-high

Info about this model

This model is converted from https://huggingface.co/OpenVoiceOS/pipertts_nl-NL_dii

Number of speakers | Sample rate (Hz)
1 | 22050

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-nl_NL-dii-high.tar.bz2
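The archive can also be fetched and unpacked from a script. The Python sketch below uses only the standard library. The helper names download, extract, and fetch_model are hypothetical, and the extracted directory name vits-piper-nl_NL-dii-high is an assumption based on the paths used in the examples in this section.

```python
import tarfile
import urllib.request
from pathlib import Path

MODEL_URL = (
    "https://github.com/k2-fsa/sherpa-onnx/releases/download/"
    "tts-models/vits-piper-nl_NL-dii-high.tar.bz2"
)


def download(url: str, dest: Path) -> Path:
    """Download url to dest unless the file already exists."""
    if not dest.exists():
        urllib.request.urlretrieve(url, str(dest))
    return dest


def extract(archive: Path, target_dir: Path) -> None:
    """Unpack a .tar.bz2 archive into target_dir."""
    with tarfile.open(archive, "r:bz2") as tar:
        if hasattr(tarfile, "data_filter"):
            # Newer Python versions support safer extraction filters.
            tar.extractall(target_dir, filter="data")
        else:
            tar.extractall(target_dir)


def fetch_model(work_dir: Path = Path(".")) -> Path:
    """Download and unpack the model; return the assumed model directory."""
    archive = download(MODEL_URL, work_dir / MODEL_URL.rsplit("/", 1)[-1])
    extract(archive, work_dir)
    return work_dir / "vits-piper-nl_NL-dii-high"
```

Calling fetch_model() from the directory where you run the examples should leave the model files where the relative paths in the examples expect them, provided the directory-name assumption holds.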

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI | URL | China mirror
arm64-v8a | Download | Download
armeabi-v7a | Download | Download
x86_64 | Download | Download
x86 | Download | Download

If you don’t know what ABI is, you probably need to select arm64-v8a.
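If you have a machine name such as the one reported by uname -m, it can be mapped to one of the four ABIs in the table. The Python sketch below shows one such mapping. The table MACHINE_TO_ABI and the helper abi_for_machine are hypothetical, the list of machine names is an assumption and not exhaustive, and unknown names fall back to arm64-v8a as suggested above.

```python
import platform

# Assumed mapping from common machine names to the ABIs listed in the table.
MACHINE_TO_ABI = {
    "aarch64": "arm64-v8a",
    "arm64": "arm64-v8a",
    "armv7l": "armeabi-v7a",
    "x86_64": "x86_64",
    "amd64": "x86_64",
    "i686": "x86",
    "i386": "x86",
}


def abi_for_machine(machine: str, default: str = "arm64-v8a") -> str:
    """Return the ABI for a machine name, or the default when unknown."""
    return MACHINE_TO_ABI.get(machine.lower(), default)


# Reports the ABI matching the machine this script runs on.
print(abi_for_machine(platform.machine()))
```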

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to try vits-piper-nl_NL-dii-high with the C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-nl_NL-dii-high/nl_NL-dii-high.onnx";
  config.model.vits.tokens = "vits-piper-nl_NL-dii-high/tokens.txt";
  config.model.vits.data_dir = "vits-piper-nl_NL-dii-high/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "God schiep het water, maar de Nederlander schiep de dijk";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leaks in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.c -lsherpa-onnx-c-api -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.
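When the binary is launched from a script, the same loader path can be set for the child process only, without exporting it in the shell. The Python sketch below builds such an environment. The helper library_env is hypothetical, and the library directory is the install prefix assumed in the build steps above.

```python
import os
import platform
from typing import Dict, Mapping


def library_env(lib_dir: str, base_env: Mapping[str, str], system: str) -> Dict[str, str]:
    """Return a copy of base_env with lib_dir prepended to the loader path."""
    # macOS uses DYLD_LIBRARY_PATH; Linux uses LD_LIBRARY_PATH.
    var = "DYLD_LIBRARY_PATH" if system == "Darwin" else "LD_LIBRARY_PATH"
    env = dict(base_env)
    existing = env.get(var, "")
    env[var] = f"{lib_dir}:{existing}" if existing else lib_dir
    return env


env = library_env("/tmp/sherpa-onnx/shared/lib", os.environ, platform.system())
# To launch the compiled example with this environment, one could call:
#   subprocess.run(["/tmp/test-piper"], env=env, check=True)
```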

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-nl_NL-dii-high.tar.bz2

You can use the following code to try vits-piper-nl_NL-dii-high with the Python API:

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-nl_NL-dii-high/nl_NL-dii-high.onnx",
            data_dir="vits-piper-nl_NL-dii-high/espeak-ng-data",
            tokens="vits-piper-nl_NL-dii-high/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="God schiep het water, maar de Nederlander schiep de dijk",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)

C++ API

You can use the following code to try vits-piper-nl_NL-dii-high with the C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-nl_NL-dii-high/nl_NL-dii-high.onnx";
  config.model.vits.tokens = "vits-piper-nl_NL-dii-high/tokens.txt";
  config.model.vits.data_dir = "vits-piper-nl_NL-dii-high/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "God schiep het water, maar de Nederlander schiep de dijk";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.cc -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to try vits-piper-nl_NL-dii-high with the Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-nl_NL-dii-high/nl_NL-dii-high.onnx".into()),
                tokens: Some("vits-piper-nl_NL-dii-high/tokens.txt".into()),
                data_dir: Some("vits-piper-nl_NL-dii-high/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "God schiep het water, maar de Nederlander schiep de dijk";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.
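The Rust example prints the number of speakers before generating with sid 0. A caller that accepts a speaker ID from user input may want to check it against that count first. The Python sketch below assumes valid IDs run from 0 to one less than the number of speakers; the helper check_speaker_id is hypothetical.

```python
def check_speaker_id(sid: int, num_speakers: int) -> int:
    """Return sid if it lies in [0, num_speakers), else raise ValueError."""
    if not 0 <= sid < num_speakers:
        raise ValueError(f"sid {sid} is out of range for {num_speakers} speaker(s)")
    return sid


# This model lists one speaker, so only sid 0 passes the check.
print(check_speaker_id(0, num_speakers=1))
```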

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to try vits-piper-nl_NL-dii-high with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-nl_NL-dii-high/nl_NL-dii-high.onnx',
        tokens: 'vits-piper-nl_NL-dii-high/tokens.txt',
        dataDir: 'vits-piper-nl_NL-dii-high/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'God schiep het water, maar de Nederlander schiep de dijk';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

const start = Date.now();
const audio = tts.generate({text, generationConfig});
const stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to try vits-piper-nl_NL-dii-high with the Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-nl_NL-dii-high/nl_NL-dii-high.onnx',
    tokens: 'vits-piper-nl_NL-dii-high/tokens.txt',
    dataDir: 'vits-piper-nl_NL-dii-high/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'God schiep het water, maar de Nederlander schiep de dijk', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to try vits-piper-nl_NL-dii-high with the Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-nl_NL-dii-high/nl_NL-dii-high.onnx",
    lexicon: "",
    tokens: "vits-piper-nl_NL-dii-high/tokens.txt",
    dataDir: "vits-piper-nl_NL-dii-high/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "God schiep het water, maar de Nederlander schiep de dijk"
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to try vits-piper-nl_NL-dii-high with the C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-nl_NL-dii-high/nl_NL-dii-high.onnx";
config.Model.Vits.Tokens = "vits-piper-nl_NL-dii-high/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-nl_NL-dii-high/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "God schiep het water, maar de Nederlander schiep de dijk";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to try vits-piper-nl_NL-dii-high with the Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-nl_NL-dii-high/nl_NL-dii-high.onnx",
        tokens = "vits-piper-nl_NL-dii-high/tokens.txt",
        dataDir = "vits-piper-nl_NL-dii-high/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "God schiep het water, maar de Nederlander schiep de dijk",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to try vits-piper-nl_NL-dii-high with the Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-nl_NL-dii-high/nl_NL-dii-high.onnx");
    vits.setTokens("vits-piper-nl_NL-dii-high/tokens.txt");
    vits.setDataDir("vits-piper-nl_NL-dii-high/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "God schiep het water, maar de Nederlander schiep de dijk";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to try vits-piper-nl_NL-dii-high with the Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-nl_NL-dii-high/nl_NL-dii-high.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-nl_NL-dii-high/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-nl_NL-dii-high/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('God schiep het water, maar de Nederlander schiep de dijk', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to try vits-piper-nl_NL-dii-high with the Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-nl_NL-dii-high/nl_NL-dii-high.onnx",
				Tokens:  "vits-piper-nl_NL-dii-high/tokens.txt",
				DataDir: "vits-piper-nl_NL-dii-high/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "God schiep het water, maar de Nederlander schiep de dijk"

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

God schiep het water, maar de Nederlander schiep de dijk

sample audio for each speaker is listed below:

Speaker 0

vits-piper-nl_NL-miro-high

Info about this model

This model is converted from https://huggingface.co/OpenVoiceOS/pipertts_nl-NL_miro

Number of speakers | Sample rate (Hz)
1 | 22050
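A generated WAV file can be checked against these values by reading its header. The Python sketch below uses the standard wave module and assumes the file is 16-bit PCM. The helpers wav_info and write_silence are hypothetical, and the silent demo file is only a stand-in for real generated audio.

```python
import os
import struct
import tempfile
import wave


def wav_info(path: str):
    """Return (channels, sample_rate, duration_seconds) from a WAV header."""
    with wave.open(path, "rb") as f:
        channels = f.getnchannels()
        sample_rate = f.getframerate()
        duration = f.getnframes() / sample_rate
    return channels, sample_rate, duration


def write_silence(path: str, sample_rate: int, seconds: float) -> None:
    """Write a mono 16-bit PCM file of silence, used here only as a stand-in."""
    num_frames = int(sample_rate * seconds)
    with wave.open(path, "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)  # 2 bytes per sample, i.e. 16-bit PCM
        f.setframerate(sample_rate)
        f.writeframes(struct.pack("<h", 0) * num_frames)


# Stand-in file: half a second of silence at the sample rate listed above.
demo = os.path.join(tempfile.mkdtemp(), "demo.wav")
write_silence(demo, 22050, 0.5)
print(wav_info(demo))
```

Pointing wav_info at a file written by one of the examples in this section should report a sample rate of 22050 if the model matches the table.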

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-nl_NL-miro-high.tar.bz2

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI | URL | China mirror
arm64-v8a | Download | Download
armeabi-v7a | Download | Download
x86_64 | Download | Download
x86 | Download | Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to try vits-piper-nl_NL-miro-high with the C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-nl_NL-miro-high/nl_NL-miro-high.onnx";
  config.model.vits.tokens = "vits-piper-nl_NL-miro-high/tokens.txt";
  config.model.vits.data_dir = "vits-piper-nl_NL-miro-high/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "God schiep het water, maar de Nederlander schiep de dijk";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leaks in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.c -lsherpa-onnx-c-api -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-nl_NL-miro-high.tar.bz2

You can use the following code to try vits-piper-nl_NL-miro-high with the Python API:

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-nl_NL-miro-high/nl_NL-miro-high.onnx",
            data_dir="vits-piper-nl_NL-miro-high/espeak-ng-data",
            tokens="vits-piper-nl_NL-miro-high/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="God schiep het water, maar de Nederlander schiep de dijk",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)

C++ API

You can use the following code to try vits-piper-nl_NL-miro-high with the C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-nl_NL-miro-high/nl_NL-miro-high.onnx";
  config.model.vits.tokens = "vits-piper-nl_NL-miro-high/tokens.txt";
  config.model.vits.data_dir = "vits-piper-nl_NL-miro-high/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "God schiep het water, maar de Nederlander schiep de dijk";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++   -std=c++17   -I /tmp/sherpa-onnx/shared/include   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-cxx-api   -lsherpa-onnx-c-api   -lonnxruntime   -o /tmp/test-piper   /tmp/test-piper.cc

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to try vits-piper-nl_NL-miro-high with the Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-nl_NL-miro-high/nl_NL-miro-high.onnx".into()),
                tokens: Some("vits-piper-nl_NL-miro-high/tokens.txt".into()),
                data_dir: Some("vits-piper-nl_NL-miro-high/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "God schiep het water, maar de Nederlander schiep de dijk";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-nl_NL-miro-high with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-nl_NL-miro-high/nl_NL-miro-high.onnx',
        tokens: 'vits-piper-nl_NL-miro-high/tokens.txt',
        dataDir: 'vits-piper-nl_NL-miro-high/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'God schiep het water, maar de Nederlander schiep de dijk';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

const start = Date.now();
const audio = tts.generate({text, generationConfig});
const stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to try vits-piper-nl_NL-miro-high with the Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-nl_NL-miro-high/nl_NL-miro-high.onnx',
    tokens: 'vits-piper-nl_NL-miro-high/tokens.txt',
    dataDir: 'vits-piper-nl_NL-miro-high/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'God schiep het water, maar de Nederlander schiep de dijk', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to try vits-piper-nl_NL-miro-high with the Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-nl_NL-miro-high/nl_NL-miro-high.onnx",
    lexicon: "",
    tokens: "vits-piper-nl_NL-miro-high/tokens.txt",
    dataDir: "vits-piper-nl_NL-miro-high/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "God schiep het water, maar de Nederlander schiep de dijk"
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to try vits-piper-nl_NL-miro-high with the C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-nl_NL-miro-high/nl_NL-miro-high.onnx";
config.Model.Vits.Tokens = "vits-piper-nl_NL-miro-high/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-nl_NL-miro-high/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "God schiep het water, maar de Nederlander schiep de dijk";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to try vits-piper-nl_NL-miro-high with the Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  val config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-nl_NL-miro-high/nl_NL-miro-high.onnx",
        tokens = "vits-piper-nl_NL-miro-high/tokens.txt",
        dataDir = "vits-piper-nl_NL-miro-high/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "God schiep het water, maar de Nederlander schiep de dijk",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to try vits-piper-nl_NL-miro-high with the Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-nl_NL-miro-high/nl_NL-miro-high.onnx");
    vits.setTokens("vits-piper-nl_NL-miro-high/tokens.txt");
    vits.setDataDir("vits-piper-nl_NL-miro-high/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "God schiep het water, maar de Nederlander schiep de dijk";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to try vits-piper-nl_NL-miro-high with the Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-nl_NL-miro-high/nl_NL-miro-high.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-nl_NL-miro-high/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-nl_NL-miro-high/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('God schiep het water, maar de Nederlander schiep de dijk', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to try vits-piper-nl_NL-miro-high with the Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-nl_NL-miro-high/nl_NL-miro-high.onnx",
				Tokens:  "vits-piper-nl_NL-miro-high/tokens.txt",
				DataDir: "vits-piper-nl_NL-miro-high/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "God schiep het water, maar de Nederlander schiep de dijk"

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

God schiep het water, maar de Nederlander schiep de dijk

audio samples for each speaker are listed below:

Speaker 0

vits-piper-nl_NL-pim-medium

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/nl/nl_NL/pim/medium

| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-nl_NL-pim-medium.tar.bz2
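
The archive can also be fetched and unpacked from a script. The following is a minimal Python sketch using only the standard library. The helper names `archive_name`, `download`, and `extract_archive` are hypothetical and not part of sherpa-onnx, and the sketch assumes the archive is a trusted `.tar.bz2` file.

```python
import tarfile
import urllib.request
from pathlib import Path

# URL taken from the "Model download address" above.
MODEL_URL = (
    "https://github.com/k2-fsa/sherpa-onnx/releases/download/"
    "tts-models/vits-piper-nl_NL-pim-medium.tar.bz2"
)


def archive_name(url: str) -> str:
    """Return the file name component of a download URL."""
    return url.rsplit("/", 1)[-1]


def download(url: str, dest_dir: str = ".") -> Path:
    """Download url into dest_dir and return the local path (hypothetical helper)."""
    target = Path(dest_dir) / archive_name(url)
    urllib.request.urlretrieve(url, str(target))
    return target


def extract_archive(archive: Path, dest_dir: str = ".") -> list:
    """Extract a .tar.bz2 archive into dest_dir and return its member names."""
    with tarfile.open(archive, mode="r:bz2") as tar:
        names = tar.getnames()
        tar.extractall(dest_dir)
    return names
```

Calling `extract_archive(download(MODEL_URL))` would then be expected to unpack the model files into the current directory, mirroring the `wget` and `tar xvf` steps shown for other models in this document.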

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

| ABI | URL | China mirror |
|---|---|---|
| arm64-v8a | Download | Download |
| armeabi-v7a | Download | Download |
| x86_64 | Download | Download |
| x86 | Download | Download |

If you don’t know what ABI is, you probably need to select arm64-v8a.
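
If you can read the device's machine string (for example, the output of `uname -m` in an `adb shell`), the ABI can be looked up mechanically. The sketch below is a hypothetical helper, not part of sherpa-onnx. Its mapping is an illustrative assumption covering only common machine strings, and it falls back to arm64-v8a, following the advice above.

```python
# Common `uname -m` values mapped to the Android ABI names used in the table.
# This mapping is an illustrative assumption and is not exhaustive.
_MACHINE_TO_ABI = {
    "aarch64": "arm64-v8a",
    "arm64": "arm64-v8a",
    "armv7l": "armeabi-v7a",
    "armv8l": "armeabi-v7a",  # 32-bit userspace on a 64-bit ARM CPU
    "x86_64": "x86_64",
    "i686": "x86",
    "i386": "x86",
}


def guess_android_abi(machine: str, default: str = "arm64-v8a") -> str:
    """Return the likely Android ABI for a machine string, or default if unknown."""
    return _MACHINE_TO_ABI.get(machine.strip().lower(), default)
```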

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to try vits-piper-nl_NL-pim-medium with the C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-nl_NL-pim-medium/nl_NL-pim-medium.onnx";
  config.model.vits.tokens = "vits-piper-nl_NL-pim-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-nl_NL-pim-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "God schiep het water, maar de Nederlander schiep de dijk";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc   -I /tmp/sherpa-onnx/shared/include   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-c-api   -lonnxruntime   -o /tmp/test-piper   /tmp/test-piper.c

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.
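
When the binary is launched from a script, the same library search path can be set for the child process only. The following Python sketch shows one way to do that. `library_path_env` is a hypothetical helper, and the paths are the ones used in the build steps above.

```python
import os
import sys


def library_path_env(lib_dir, platform=sys.platform, base_env=None):
    """Return a copy of the environment with lib_dir prepended to the
    dynamic-library search path variable for the given platform."""
    env = dict(os.environ if base_env is None else base_env)
    # macOS uses DYLD_LIBRARY_PATH; Linux uses LD_LIBRARY_PATH.
    var = "DYLD_LIBRARY_PATH" if platform == "darwin" else "LD_LIBRARY_PATH"
    old = env.get(var, "")
    # Avoid a trailing separator when the variable was previously unset.
    env[var] = lib_dir + (os.pathsep + old if old else "")
    return env
```

The result can be passed to the standard library, for example `subprocess.run(["./test-piper"], cwd="/tmp", env=library_path_env("/tmp/sherpa-onnx/shared/lib"), check=True)`.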

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-nl_NL-pim-medium.tar.bz2

You can use the following code to play with vits-piper-nl_NL-pim-medium

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-nl_NL-pim-medium/nl_NL-pim-medium.onnx",
            data_dir="vits-piper-nl_NL-pim-medium/espeak-ng-data",
            tokens="vits-piper-nl_NL-pim-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="God schiep het water, maar de Nederlander schiep de dijk",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
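
To gauge synthesis speed, you can time the `generate` call and compute the real-time factor (RTF), as the Node.js examples in this document do: elapsed time divided by audio duration. A minimal sketch follows. `audio_duration` and `real_time_factor` are hypothetical helpers, not part of the sherpa-onnx Python API.

```python
def audio_duration(num_samples: int, sample_rate: int) -> float:
    """Duration of the audio in seconds."""
    if sample_rate <= 0:
        raise ValueError("sample_rate must be positive")
    return num_samples / sample_rate


def real_time_factor(elapsed_seconds: float, num_samples: int, sample_rate: int) -> float:
    """RTF = processing time / audio duration; below 1 means faster than real time."""
    duration = audio_duration(num_samples, sample_rate)
    if duration == 0:
        raise ValueError("audio is empty")
    return elapsed_seconds / duration


# 44100 samples at 22050 Hz is 2 s of audio; generating it in 0.5 s gives RTF 0.25.
print(real_time_factor(0.5, 44100, 22050))  # prints 0.25
```

With the example above, you would measure the time around `tts.generate(...)` using `time.perf_counter()` and pass `len(audio.samples)` and `audio.sample_rate` to `real_time_factor`.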

C++ API

You can use the following code to try vits-piper-nl_NL-pim-medium with the C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-nl_NL-pim-medium/nl_NL-pim-medium.onnx";
  config.model.vits.tokens = "vits-piper-nl_NL-pim-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-nl_NL-pim-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "God schiep het water, maar de Nederlander schiep de dijk";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++   -std=c++17   -I /tmp/sherpa-onnx/shared/include   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-cxx-api   -lsherpa-onnx-c-api   -lonnxruntime   -o /tmp/test-piper   /tmp/test-piper.cc

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to try vits-piper-nl_NL-pim-medium with the Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-nl_NL-pim-medium/nl_NL-pim-medium.onnx".into()),
                tokens: Some("vits-piper-nl_NL-pim-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-nl_NL-pim-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "God schiep het water, maar de Nederlander schiep de dijk";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-nl_NL-pim-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-nl_NL-pim-medium/nl_NL-pim-medium.onnx',
        tokens: 'vits-piper-nl_NL-pim-medium/tokens.txt',
        dataDir: 'vits-piper-nl_NL-pim-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'God schiep het water, maar de Nederlander schiep de dijk';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

const start = Date.now();
const audio = tts.generate({text, generationConfig});
const stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to try vits-piper-nl_NL-pim-medium with the Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-nl_NL-pim-medium/nl_NL-pim-medium.onnx',
    tokens: 'vits-piper-nl_NL-pim-medium/tokens.txt',
    dataDir: 'vits-piper-nl_NL-pim-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'God schiep het water, maar de Nederlander schiep de dijk', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to try vits-piper-nl_NL-pim-medium with the Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-nl_NL-pim-medium/nl_NL-pim-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-nl_NL-pim-medium/tokens.txt",
    dataDir: "vits-piper-nl_NL-pim-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "God schiep het water, maar de Nederlander schiep de dijk"
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to try vits-piper-nl_NL-pim-medium with the C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-nl_NL-pim-medium/nl_NL-pim-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-nl_NL-pim-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-nl_NL-pim-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "God schiep het water, maar de Nederlander schiep de dijk";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to try vits-piper-nl_NL-pim-medium with the Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  val config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-nl_NL-pim-medium/nl_NL-pim-medium.onnx",
        tokens = "vits-piper-nl_NL-pim-medium/tokens.txt",
        dataDir = "vits-piper-nl_NL-pim-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "God schiep het water, maar de Nederlander schiep de dijk",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to try vits-piper-nl_NL-pim-medium with the Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-nl_NL-pim-medium/nl_NL-pim-medium.onnx");
    vits.setTokens("vits-piper-nl_NL-pim-medium/tokens.txt");
    vits.setDataDir("vits-piper-nl_NL-pim-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "God schiep het water, maar de Nederlander schiep de dijk";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to try vits-piper-nl_NL-pim-medium with the Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-nl_NL-pim-medium/nl_NL-pim-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-nl_NL-pim-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-nl_NL-pim-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('God schiep het water, maar de Nederlander schiep de dijk', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to try vits-piper-nl_NL-pim-medium with the Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-nl_NL-pim-medium/nl_NL-pim-medium.onnx",
				Tokens:  "vits-piper-nl_NL-pim-medium/tokens.txt",
				DataDir: "vits-piper-nl_NL-pim-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "God schiep het water, maar de Nederlander schiep de dijk"

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

God schiep het water, maar de Nederlander schiep de dijk

audio samples for each speaker are listed below:

Speaker 0

vits-piper-nl_NL-ronnie-medium

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/nl/nl_NL/ronnie/medium

| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-nl_NL-ronnie-medium.tar.bz2

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

| ABI | URL | China mirror |
|---|---|---|
| arm64-v8a | Download | Download |
| armeabi-v7a | Download | Download |
| x86_64 | Download | Download |
| x86 | Download | Download |

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to try vits-piper-nl_NL-ronnie-medium with the C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-nl_NL-ronnie-medium/nl_NL-ronnie-medium.onnx";
  config.model.vits.tokens = "vits-piper-nl_NL-ronnie-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-nl_NL-ronnie-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "God schiep het water, maar de Nederlander schiep de dijk";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc   -I /tmp/sherpa-onnx/shared/include   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-c-api   -lonnxruntime   -o /tmp/test-piper   /tmp/test-piper.c

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-nl_NL-ronnie-medium.tar.bz2

You can use the following code to play with vits-piper-nl_NL-ronnie-medium

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-nl_NL-ronnie-medium/nl_NL-ronnie-medium.onnx",
            data_dir="vits-piper-nl_NL-ronnie-medium/espeak-ng-data",
            tokens="vits-piper-nl_NL-ronnie-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="God schiep het water, maar de Nederlander schiep de dijk",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)

C++ API

You can use the following code to play with vits-piper-nl_NL-ronnie-medium with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-nl_NL-ronnie-medium/nl_NL-ronnie-medium.onnx";
  config.model.vits.tokens = "vits-piper-nl_NL-ronnie-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-nl_NL-ronnie-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "God schiep het water, maar de Nederlander schiep de dijk";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++   -std=c++17   -I /tmp/sherpa-onnx/shared/include   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-cxx-api   -lsherpa-onnx-c-api   -lonnxruntime   -o /tmp/test-piper   /tmp/test-piper.cc

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with vits-piper-nl_NL-ronnie-medium with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-nl_NL-ronnie-medium/nl_NL-ronnie-medium.onnx".into()),
                tokens: Some("vits-piper-nl_NL-ronnie-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-nl_NL-ronnie-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "God schiep het water, maar de Nederlander schiep de dijk";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-nl_NL-ronnie-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-nl_NL-ronnie-medium/nl_NL-ronnie-medium.onnx',
        tokens: 'vits-piper-nl_NL-ronnie-medium/tokens.txt',
        dataDir: 'vits-piper-nl_NL-ronnie-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'God schiep het water, maar de Nederlander schiep de dijk';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.
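
The real-time factor (RTF) printed by the example above is the elapsed synthesis time divided by the duration of the generated audio. The same arithmetic can be sketched in Python for use with the Python API. The helper names `timed` and `real_time_factor` are illustrative and not part of sherpa-onnx.

```python
import time


def timed(fn, *args, **kwargs):
    """Call fn and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """RTF = elapsed time / audio duration; values below 1 are faster than real time."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate
    return elapsed_seconds / duration


# Hypothetical usage with an existing sherpa_onnx.OfflineTts instance `tts`:
#   audio, elapsed = timed(tts.generate, text="...", sid=0, speed=1.0)
#   print(real_time_factor(elapsed, len(audio.samples), audio.sample_rate))
```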

Dart API

You can use the following code to play with vits-piper-nl_NL-ronnie-medium with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-nl_NL-ronnie-medium/nl_NL-ronnie-medium.onnx',
    tokens: 'vits-piper-nl_NL-ronnie-medium/tokens.txt',
    dataDir: 'vits-piper-nl_NL-ronnie-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'God schiep het water, maar de Nederlander schiep de dijk', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-nl_NL-ronnie-medium with Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-nl_NL-ronnie-medium/nl_NL-ronnie-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-nl_NL-ronnie-medium/tokens.txt",
    dataDir: "vits-piper-nl_NL-ronnie-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "God schiep het water, maar de Nederlander schiep de dijk"
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-nl_NL-ronnie-medium with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-nl_NL-ronnie-medium/nl_NL-ronnie-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-nl_NL-ronnie-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-nl_NL-ronnie-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "God schiep het water, maar de Nederlander schiep de dijk";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-nl_NL-ronnie-medium with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-nl_NL-ronnie-medium/nl_NL-ronnie-medium.onnx",
        tokens = "vits-piper-nl_NL-ronnie-medium/tokens.txt",
        dataDir = "vits-piper-nl_NL-ronnie-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "God schiep het water, maar de Nederlander schiep de dijk",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with vits-piper-nl_NL-ronnie-medium with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-nl_NL-ronnie-medium/nl_NL-ronnie-medium.onnx");
    vits.setTokens("vits-piper-nl_NL-ronnie-medium/tokens.txt");
    vits.setDataDir("vits-piper-nl_NL-ronnie-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "God schiep het water, maar de Nederlander schiep de dijk";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-nl_NL-ronnie-medium with Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-nl_NL-ronnie-medium/nl_NL-ronnie-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-nl_NL-ronnie-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-nl_NL-ronnie-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('God schiep het water, maar de Nederlander schiep de dijk', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-nl_NL-ronnie-medium with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-nl_NL-ronnie-medium/nl_NL-ronnie-medium.onnx",
				Tokens:  "vits-piper-nl_NL-ronnie-medium/tokens.txt",
				DataDir: "vits-piper-nl_NL-ronnie-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "God schiep het water, maar de Nederlander schiep de dijk"

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

God schiep het water, maar de Nederlander schiep de dijk

sample audios for different speakers are listed below:

Speaker 0

supertonic-3-nl

Info about this model

This model is supertonic 3 from https://huggingface.co/Supertone/supertonic-3

It supports 31 languages: en, ko, ja, ar, bg, cs, da, de, el, es, et, fi, fr, hi, hr, hu, id, it, lt, lv, nl, pl, pt, ro, ru, sk, sl, sv, tr, uk, vi.

This page shows samples for Dutch (nl).
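
The C, C++, and Rust examples in this section pass the language as a JSON string such as `{"lang": "nl"}` in the `extra` field of the generation config. As a minimal sketch, the string can be built and checked against the language list above before synthesis. The names `SUPPORTED_LANGS` and `make_extra` are illustrative and not part of sherpa-onnx.

```python
import json

# Language codes copied from the list above.
SUPPORTED_LANGS = (
    "en", "ko", "ja", "ar", "bg", "cs", "da", "de", "el", "es", "et",
    "fi", "fr", "hi", "hr", "hu", "id", "it", "lt", "lv", "nl", "pl",
    "pt", "ro", "ru", "sk", "sl", "sv", "tr", "uk", "vi",
)


def make_extra(lang):
    """Return the JSON string for the `extra` field, rejecting unknown codes."""
    if lang not in SUPPORTED_LANGS:
        raise ValueError(f"unsupported language code: {lang!r}")
    return json.dumps({"lang": lang})


print(make_extra("nl"))  # prints {"lang": "nl"}
```

Validating the code up front turns a typo such as `"nL"` into an immediate error instead of a silent fallback or a failure inside the engine.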

| Number of speakers | Sample rate |
|---|---|
| 10 | 24000 |

Speaker IDs

sid: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-supertonic-3-tts-int8-2026-05-11.tar.bz2

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

| ABI | URL | China mirror |
|---|---|---|
| arm64-v8a | Download | Download |
| armeabi-v7a | Download | Download |
| x86_64 | Download | Download |
| x86 | Download | Download |

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-supertonic-3-tts-int8-2026-05-11.tar.bz2

You can use the following code to play with supertonic-3

import sherpa_onnx
import soundfile as sf

tts_config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        supertonic=sherpa_onnx.OfflineTtsSupertonicModelConfig(
            duration_predictor="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
            text_encoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
            vector_estimator="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
            vocoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
            tts_json="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
            unicode_indexer="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
            voice_style="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
        ),
        debug=False,
        num_threads=2,
        provider="cpu",
    ),
)

tts = sherpa_onnx.OfflineTts(tts_config)

gen_config = sherpa_onnx.GenerationConfig()
gen_config.sid = 0
gen_config.num_steps = 8
gen_config.speed = 1.0
gen_config.extra["lang"] = "nl"

audio = tts.generate("Dit is een tekst-naar-spraak-engine die gebruik maakt van Kaldi van de volgende generatie", gen_config)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
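
To confirm that the file was written with the expected sample rate and a plausible duration, a small check with the standard-library `wave` module may help. This is a sketch that assumes `test.wav` is 16-bit PCM, which is soundfile's default subtype for WAV files. The helper name `describe_wav` is illustrative.

```python
import wave


def describe_wav(path):
    """Return sample rate, channel count, and duration of a PCM WAV file."""
    with wave.open(path, "rb") as f:
        rate = f.getframerate()
        frames = f.getnframes()
        return {
            "sample_rate": rate,
            "channels": f.getnchannels(),
            "duration_seconds": frames / rate,
        }


# Hypothetical usage after running the example above:
#   print(describe_wav("test.wav"))
```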

C API

You can use the following code to play with supertonic-3 with C API.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));
  config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
  config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
  config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
  config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
  config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
  config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
  config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";

  config.model.num_threads = 2;

  // If you don't want to see debug messages, please set it to 0
  config.model.debug = 1;

  const char *text = "Dit is een tekst-naar-spraak-engine die gebruik maakt van Kaldi van de volgende generatie";

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.num_steps = 8;
  gen_cfg.speed = 1.0;
  gen_cfg.extra = "{\"lang\": \"nl\"}";

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg,
                                             ProgressCallback, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
 -DSHERPA_ONNX_ENABLE_C_API=ON \
 -DCMAKE_BUILD_TYPE=Release \
 -DBUILD_SHARED_LIBS=ON \
 -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
 ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-supertonic.c. Then you can compile it with the following command:

gcc \
  -I /tmp/sherpa-onnx/shared/include \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-c-api \
  -lonnxruntime \
  -o /tmp/test-supertonic \
  /tmp/test-supertonic.c

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-supertonic.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

C++ API

You can use the following code to play with supertonic-3 with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;

  config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
  config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
  config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
  config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
  config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
  config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
  config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";

  config.model.num_threads = 2;

  // If you don't want to see debug messages, please set it to 0
  config.model.debug = 1;

  std::string filename = "./test.wav";
  std::string text = "Dit is een tekst-naar-spraak-engine die gebruik maakt van Kaldi van de volgende generatie";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.num_steps = 8;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed
  gen_cfg.extra = R"({"lang": "nl"})";

  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
 -DSHERPA_ONNX_ENABLE_C_API=ON \
 -DCMAKE_BUILD_TYPE=Release \
 -DBUILD_SHARED_LIBS=ON \
 -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
 ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-supertonic.cc. Then you can compile it with the following command:

g++ \
  -std=c++17 \
  -I /tmp/sherpa-onnx/shared/include \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-cxx-api \
  -lsherpa-onnx-c-api \
  -lonnxruntime \
  -o /tmp/test-supertonic \
  /tmp/test-supertonic.cc

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-supertonic.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with supertonic-3 with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsSupertonicModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            supertonic: OfflineTtsSupertonicModelConfig {
                duration_predictor: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx".into()),
                text_encoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx".into()),
                vector_estimator: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx".into()),
                vocoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx".into()),
                tts_json: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json".into()),
                unicode_indexer: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin".into()),
                voice_style: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Dit is een tekst-naar-spraak-engine die gebruik maakt van Kaldi van de volgende generatie";

    let gen_config = GenerationConfig {
        sid: 0,
        num_steps: 8,
        speed: 1.0,
        extra: Some(r#"{"lang": "nl"}"#.into()),
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with supertonic-3 with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      supertonic: {
        durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
        textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
        vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
        vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
        ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
        unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
        voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
      },
      debug: true,
      numThreads: 2,
      provider: 'cpu',
    },
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Dit is een tekst-naar-spraak-engine die gebruik maakt van Kaldi van de volgende generatie';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  numSteps: 8,
  speed: 1.0,
  extra: {lang: 'nl'},
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to play with supertonic-3 with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final supertonic = sherpa_onnx.OfflineTtsSupertonicModelConfig(
    durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
    textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
    vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
    vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
    ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
    unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
    voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    supertonic: supertonic,
    numThreads: 2,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    numSteps: 8,
    speed: 1.0,
    extra: {'lang': 'nl'},
  );
  final audio = tts.generateWithConfig(text: 'Dit is een tekst-naar-spraak-engine die gebruik maakt van Kaldi van de volgende generatie', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with supertonic-3 using the Swift API.

func run() {
  let supertonic = sherpaOnnxOfflineTtsSupertonicModelConfig(
    durationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
    textEncoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
    vectorEstimator: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
    vocoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
    ttsJson: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
    unicodeIndexer: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
    voiceStyle: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(supertonic: supertonic)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Dit is een tekst-naar-spraak-engine die gebruik maakt van Kaldi van de volgende generatie"
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.numSteps = 8
  genConfig.speed = 1.0
  genConfig.extra = ["lang": "nl"]

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with supertonic-3 using the C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Supertonic.DurationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.Model.Supertonic.TextEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.Model.Supertonic.VectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.Model.Supertonic.Vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.Model.Supertonic.TtsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.Model.Supertonic.UnicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.Model.Supertonic.VoiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";

config.Model.NumThreads = 2;
config.Model.Debug = 1;
config.Model.Provider = "cpu";

var tts = new OfflineTts(config);
var text = "Dit is een tekst-naar-spraak-engine die gebruik maakt van Kaldi van de volgende generatie";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.NumSteps = 8;
genConfig.Speed = 1.0f;
genConfig.Extra = "{\"lang\": \"nl\"}";

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with supertonic-3 using the Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      supertonic = OfflineTtsSupertonicModelConfig(
        durationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
        textEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
        vectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
        vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
        ttsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
        unicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
        voiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
      ),
      numThreads = 2,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    numSteps = 8,
    speed = 1.0f,
    extra = mapOf("lang" to "nl"),
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Dit is een tekst-naar-spraak-engine die gebruik maakt van Kaldi van de volgende generatie",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with supertonic-3 using the Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var supertonic = new OfflineTtsSupertonicModelConfig();
    supertonic.setDurationPredictor("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx");
    supertonic.setTextEncoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx");
    supertonic.setVectorEstimator("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx");
    supertonic.setVocoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx");
    supertonic.setTtsJson("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json");
    supertonic.setUnicodeIndexer("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin");
    supertonic.setVoiceStyle("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setSupertonic(supertonic);
    modelConfig.setNumThreads(2);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);

    var tts = new OfflineTts(config);
    var text = "Dit is een tekst-naar-spraak-engine die gebruik maakt van Kaldi van de volgende generatie";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setNumSteps(8);
    genConfig.setSpeed(1.0f);
    genConfig.setExtra("{\"lang\": \"nl\"}");

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with supertonic-3 using the Pascal API.

program test_supertonic;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Supertonic.DurationPredictor := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx';
  Config.Model.Supertonic.TextEncoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx';
  Config.Model.Supertonic.VectorEstimator := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx';
  Config.Model.Supertonic.Vocoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx';
  Config.Model.Supertonic.TtsJson := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json';
  Config.Model.Supertonic.UnicodeIndexer := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin';
  Config.Model.Supertonic.VoiceStyle := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin';

  Config.Model.NumThreads := 2;
  Config.Model.Debug := True;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.NumSteps := 8;
  GenConfig.Speed := 1.0;
  GenConfig.Extra := '{"lang": "nl"}';

  Audio := Tts.GenerateWithConfig('Dit is een tekst-naar-spraak-engine die gebruik maakt van Kaldi van de volgende generatie', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with supertonic-3 using the Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Supertonic: sherpa.OfflineTtsSupertonicModelConfig{
				DurationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
				TextEncoder:       "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
				VectorEstimator:   "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
				Vocoder:           "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
				TtsJson:           "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
				UnicodeIndexer:    "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
				VoiceStyle:        "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
			},
			NumThreads: 2,
			Debug:      true,
		},
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Dit is een tekst-naar-spraak-engine die gebruik maakt van Kaldi van de volgende generatie"

	genConfig := sherpa.GenerationConfig{
		Sid:       0,
		NumSteps:  8,
		Speed:     1.0,
		Extra:     `{"lang": "nl"}`,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

Sample audio for each speaker is listed below:

Speaker 0

0

Hallo wereld.

1

Hoe gaat het vandaag?

2

De lucht is blauw en de wind is zacht.

3

Machine learning helpt computers om van gegevens te leren.

4

Spraaksynthese zet tekst om in duidelijke audio.

5

De leerlingen lazen een kort verhaal in de bibliotheek.

6

De trein had vertraging door onderhoud aan het spoor.

7

Kleine modellen draaien snel op lokale apparaten.

8

Een stemassistent helpt bij dagelijkse taken.

9

Stabiel voorlezen is belangrijk voor korte en lange zinnen.

Speaker 1

0

Hallo wereld.

1

Hoe gaat het vandaag?

2

De lucht is blauw en de wind is zacht.

3

Machine learning helpt computers om van gegevens te leren.

4

Spraaksynthese zet tekst om in duidelijke audio.

5

De leerlingen lazen een kort verhaal in de bibliotheek.

6

De trein had vertraging door onderhoud aan het spoor.

7

Kleine modellen draaien snel op lokale apparaten.

8

Een stemassistent helpt bij dagelijkse taken.

9

Stabiel voorlezen is belangrijk voor korte en lange zinnen.

Speaker 2

0

Hallo wereld.

1

Hoe gaat het vandaag?

2

De lucht is blauw en de wind is zacht.

3

Machine learning helpt computers om van gegevens te leren.

4

Spraaksynthese zet tekst om in duidelijke audio.

5

De leerlingen lazen een kort verhaal in de bibliotheek.

6

De trein had vertraging door onderhoud aan het spoor.

7

Kleine modellen draaien snel op lokale apparaten.

8

Een stemassistent helpt bij dagelijkse taken.

9

Stabiel voorlezen is belangrijk voor korte en lange zinnen.

Speaker 3

0

Hallo wereld.

1

Hoe gaat het vandaag?

2

De lucht is blauw en de wind is zacht.

3

Machine learning helpt computers om van gegevens te leren.

4

Spraaksynthese zet tekst om in duidelijke audio.

5

De leerlingen lazen een kort verhaal in de bibliotheek.

6

De trein had vertraging door onderhoud aan het spoor.

7

Kleine modellen draaien snel op lokale apparaten.

8

Een stemassistent helpt bij dagelijkse taken.

9

Stabiel voorlezen is belangrijk voor korte en lange zinnen.

Speaker 4

0

Hallo wereld.

1

Hoe gaat het vandaag?

2

De lucht is blauw en de wind is zacht.

3

Machine learning helpt computers om van gegevens te leren.

4

Spraaksynthese zet tekst om in duidelijke audio.

5

De leerlingen lazen een kort verhaal in de bibliotheek.

6

De trein had vertraging door onderhoud aan het spoor.

7

Kleine modellen draaien snel op lokale apparaten.

8

Een stemassistent helpt bij dagelijkse taken.

9

Stabiel voorlezen is belangrijk voor korte en lange zinnen.

Speaker 5

0

Hallo wereld.

1

Hoe gaat het vandaag?

2

De lucht is blauw en de wind is zacht.

3

Machine learning helpt computers om van gegevens te leren.

4

Spraaksynthese zet tekst om in duidelijke audio.

5

De leerlingen lazen een kort verhaal in de bibliotheek.

6

De trein had vertraging door onderhoud aan het spoor.

7

Kleine modellen draaien snel op lokale apparaten.

8

Een stemassistent helpt bij dagelijkse taken.

9

Stabiel voorlezen is belangrijk voor korte en lange zinnen.

Speaker 6

0

Hallo wereld.

1

Hoe gaat het vandaag?

2

De lucht is blauw en de wind is zacht.

3

Machine learning helpt computers om van gegevens te leren.

4

Spraaksynthese zet tekst om in duidelijke audio.

5

De leerlingen lazen een kort verhaal in de bibliotheek.

6

De trein had vertraging door onderhoud aan het spoor.

7

Kleine modellen draaien snel op lokale apparaten.

8

Een stemassistent helpt bij dagelijkse taken.

9

Stabiel voorlezen is belangrijk voor korte en lange zinnen.

Speaker 7

0

Hallo wereld.

1

Hoe gaat het vandaag?

2

De lucht is blauw en de wind is zacht.

3

Machine learning helpt computers om van gegevens te leren.

4

Spraaksynthese zet tekst om in duidelijke audio.

5

De leerlingen lazen een kort verhaal in de bibliotheek.

6

De trein had vertraging door onderhoud aan het spoor.

7

Kleine modellen draaien snel op lokale apparaten.

8

Een stemassistent helpt bij dagelijkse taken.

9

Stabiel voorlezen is belangrijk voor korte en lange zinnen.

Speaker 8

0

Hallo wereld.

1

Hoe gaat het vandaag?

2

De lucht is blauw en de wind is zacht.

3

Machine learning helpt computers om van gegevens te leren.

4

Spraaksynthese zet tekst om in duidelijke audio.

5

De leerlingen lazen een kort verhaal in de bibliotheek.

6

De trein had vertraging door onderhoud aan het spoor.

7

Kleine modellen draaien snel op lokale apparaten.

8

Een stemassistent helpt bij dagelijkse taken.

9

Stabiel voorlezen is belangrijk voor korte en lange zinnen.

Speaker 9

0

Hallo wereld.

1

Hoe gaat het vandaag?

2

De lucht is blauw en de wind is zacht.

3

Machine learning helpt computers om van gegevens te leren.

4

Spraaksynthese zet tekst om in duidelijke audio.

5

De leerlingen lazen een kort verhaal in de bibliotheek.

6

De trein had vertraging door onderhoud aan het spoor.

7

Kleine modellen draaien snel op lokale apparaten.

8

Een stemassistent helpt bij dagelijkse taken.

9

Stabiel voorlezen is belangrijk voor korte en lange zinnen.

English

This section lists text to speech models for English.

matcha-icefall-en_US-ljspeech

Info about this model

This model is trained using the code from https://github.com/k2-fsa/icefall/tree/master/egs/ljspeech/TTS/matcha

It supports only English.

Number of speakers    Sample rate
1                     22050

Download the model

You need to download the acoustic model and the vocoder model.

Download the acoustic model

Please use the following code to download the model:

wget https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/matcha-icefall-en_US-ljspeech.tar.bz2
tar xvf matcha-icefall-en_US-ljspeech.tar.bz2

rm matcha-icefall-en_US-ljspeech.tar.bz2

You should see the following output:

ls -lh  matcha-icefall-en_US-ljspeech/
total 144856
-rw-r--r--    1 fangjun  staff   251B Jan  2 11:05 README.md
drwxr-xr-x  122 fangjun  staff   3.8K Nov 28  2023 espeak-ng-data
-rw-r--r--@   1 fangjun  staff    71M Jan  2 04:04 model-steps-3.onnx
-rw-r--r--    1 fangjun  staff   954B Jan  2 11:05 tokens.txt

Download the vocoder model

wget https://github.com/k2-fsa/sherpa-onnx/releases/download/vocoder-models/vocos-22khz-univ.onnx

You should see the following output:

ls -lh vocos-22khz-univ.onnx

-rw-r--r--@ 1 fangjun  staff    51M 17 Mar  2025 vocos-22khz-univ.onnx
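
Before running the examples below, it can help to confirm that the downloaded files are where the examples expect them. The following Python sketch checks the paths used throughout this section relative to a given directory. The helper name `missing_files` is hypothetical; the file list is taken from the listings above.

```python
from pathlib import Path

# Paths referenced by the examples in this section, relative to the working directory.
REQUIRED = [
    "matcha-icefall-en_US-ljspeech/model-steps-3.onnx",
    "matcha-icefall-en_US-ljspeech/tokens.txt",
    "matcha-icefall-en_US-ljspeech/espeak-ng-data",
    "vocos-22khz-univ.onnx",
]


def missing_files(root, required=REQUIRED):
    """Return the entries of `required` that do not exist under `root`."""
    root = Path(root)
    return [p for p in required if not (root / p).exists()]


if __name__ == "__main__":
    missing = missing_files(".")
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All model files found")
```

Run it from the directory that contains both the extracted model directory and the vocoder file.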

Huggingface space

You can try this model by visiting https://huggingface.co/spaces/k2-fsa/text-to-speech

Huggingface space (WebAssembly, wasm)

You can try this model by visiting

https://huggingface.co/spaces/k2-fsa/web-assembly-en-tts-matcha

The source code is available at https://github.com/k2-fsa/sherpa-onnx/tree/master/wasm/tts

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v

ABI            URL         China mirror
arm64-v8a      Download    Download
armeabi-v7a    Download    Download
x86_64         Download    Download
x86            Download    Download

If you don’t know what ABI is, you probably need to select arm64-v8a.
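
If you would rather check than guess, a device connected over adb reports its ABI with `adb shell getprop ro.product.cpu.abi`. As a rough alternative, the Python sketch below maps common machine strings (as printed by `uname -m`) to the ABI names in the table. The mapping and the helper name `guess_abi` are illustrative assumptions, not part of sherpa-onnx, and the fallback mirrors the advice above.

```python
import platform

# Common `uname -m` strings mapped to the ABI names used in the table above.
# This mapping is an illustrative assumption and is not exhaustive.
MACHINE_TO_ABI = {
    "aarch64": "arm64-v8a",
    "arm64": "arm64-v8a",
    "armv7l": "armeabi-v7a",
    "x86_64": "x86_64",
    "amd64": "x86_64",
    "i686": "x86",
    "i386": "x86",
}


def guess_abi(machine: str, default: str = "arm64-v8a") -> str:
    """Map a machine string to an ABI name, falling back to `default`."""
    return MACHINE_TO_ABI.get(machine.strip().lower(), default)


if __name__ == "__main__":
    # platform.machine() describes the machine running this script, so the
    # result is only meaningful when the script runs on the Android device.
    print(guess_abi(platform.machine()))
```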

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

Python API

The following code shows how to use the Python API of sherpa-onnx with this model.

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        matcha=sherpa_onnx.OfflineTtsMatchaModelConfig(
            acoustic_model="matcha-icefall-en_US-ljspeech/model-steps-3.onnx",
            vocoder="vocos-22khz-univ.onnx",
            tokens="matcha-icefall-en_US-ljspeech/tokens.txt",
            data_dir="matcha-icefall-en_US-ljspeech/espeak-ng-data",
        ),
        num_threads=2,
        debug=True, # set it False to disable debug output
    ),
    max_num_sentences=1,
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."


audio = tts.generate(text, sid=0, speed=1.0)

sf.write(
    "./test.mp3",
    audio.samples,
    samplerate=audio.sample_rate,
)

You can save it as test-en.py and then run:

pip install sherpa-onnx soundfile

python3 ./test-en.py

You will get a file test.mp3 in the end.

C API

You can use the following code to play with matcha-icefall-en_US-ljspeech using the C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.matcha.acoustic_model = "matcha-icefall-en_US-ljspeech/model-steps-3.onnx";
  config.model.matcha.vocoder = "vocos-22khz-univ.onnx";
  config.model.matcha.tokens = "matcha-icefall-en_US-ljspeech/tokens.txt";
  config.model.matcha.data_dir = "matcha-icefall-en_US-ljspeech/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // Free the generated audio and the TTS object to avoid memory leaks in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
 -DSHERPA_ONNX_ENABLE_C_API=ON \
 -DCMAKE_BUILD_TYPE=Release \
 -DBUILD_SHARED_LIBS=ON \
 -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
 ..

make
make install

You can find the required header files and library files in /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-en.c. Then you can compile it with the following command:

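# Libraries are listed after the source file so that linkers which resolve
# symbols in command-line order can find them.
gcc \
  -I /tmp/sherpa-onnx/shared/include \
  -L /tmp/sherpa-onnx/shared/lib \
  -o /tmp/test-en \
  /tmp/test-en.c \
  -lsherpa-onnx-c-api \
  -lonnxruntime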

Now you can run

cd /tmp

# Assume you have downloaded the acoustic model and the vocoder model and put them in /tmp
./test-en

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-en.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html
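
To confirm that ./test.wav was written as expected, one option is to read its header with Python's standard `wave` module. This sketch assumes the file is uncompressed PCM, which is the only kind the `wave` module reads; the helper name `wav_info` is hypothetical. For this model the reported sample rate should match the 22050 given in the table at the top of this section.

```python
import os
import wave


def wav_info(path):
    """Return (sample_rate, num_channels, duration_seconds) of a PCM WAV file."""
    with wave.open(path, "rb") as f:
        sample_rate = f.getframerate()
        num_channels = f.getnchannels()
        duration = f.getnframes() / sample_rate
    return sample_rate, num_channels, duration


if __name__ == "__main__":
    path = "./test.wav"
    if os.path.exists(path):
        rate, channels, seconds = wav_info(path)
        print(f"{path}: {rate} Hz, {channels} channel(s), {seconds:.3f} s")
    else:
        print(f"{path} not found; run one of the examples first")
```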

C++ API

You can use the following code to play with matcha-icefall-en_US-ljspeech using the C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.matcha.acoustic_model = "matcha-icefall-en_US-ljspeech/model-steps-3.onnx";
  config.model.matcha.vocoder = "vocos-22khz-univ.onnx";
  config.model.matcha.tokens = "matcha-icefall-en_US-ljspeech/tokens.txt";
  config.model.matcha.data_dir = "matcha-icefall-en_US-ljspeech/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
 -DSHERPA_ONNX_ENABLE_C_API=ON \
 -DCMAKE_BUILD_TYPE=Release \
 -DBUILD_SHARED_LIBS=ON \
 -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
 ..

make
make install

You can find the required header files and library files in /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-en.cc. Then you can compile it with the following command:

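# Libraries are listed after the source file so that linkers which resolve
# symbols in command-line order can find them.
g++ \
  -std=c++17 \
  -I /tmp/sherpa-onnx/shared/include \
  -L /tmp/sherpa-onnx/shared/lib \
  -o /tmp/test-en \
  /tmp/test-en.cc \
  -lsherpa-onnx-cxx-api \
  -lsherpa-onnx-c-api \
  -lonnxruntime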

Now you can run

cd /tmp

# Assume you have downloaded the acoustic model and the vocoder model and put them in /tmp
./test-en

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-en.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html
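
The ProgressCallback in the C++ example returns 1 to continue generating and 0 to stop. The Python sketch below illustrates that convention in isolation, using a hypothetical `StopAfter` class that asks for generation to stop once a given amount of audio has been received. It does not call sherpa-onnx: the loop that feeds chunks to the callback is a stand-in for the library, and it assumes each call receives only the newly generated samples.

```python
class StopAfter:
    """Callback object: returns 1 to continue, 0 to stop after max_seconds of audio."""

    def __init__(self, sample_rate: int, max_seconds: float):
        self.max_samples = int(sample_rate * max_seconds)
        self.received = 0  # total number of samples seen so far

    def __call__(self, samples, progress: float) -> int:
        self.received += len(samples)
        return 1 if self.received < self.max_samples else 0


# Stand-in for the library: feed half-second chunks until the callback returns 0.
callback = StopAfter(sample_rate=22050, max_seconds=1.0)
chunks_sent = 0
for i in range(10):
    chunks_sent += 1
    if callback([0.0] * 11025, progress=(i + 1) / 10) == 0:
        break
print("Stopped after", chunks_sent, "chunks")
```

Keeping the counter on an object rather than in a global makes it easy to use a fresh limit for each generation request.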

Rust API

You can use the following code to play with matcha-icefall-en_US-ljspeech using the Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsMatchaModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            matcha: OfflineTtsMatchaModelConfig {
                acoustic_model: Some("matcha-icefall-en_US-ljspeech/model-steps-3.onnx".into()),
                vocoder: Some("vocos-22khz-univ.onnx".into()),
                tokens: Some("matcha-icefall-en_US-ljspeech/tokens.txt".into()),
                data_dir: Some("matcha-icefall-en_US-ljspeech/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with matcha-icefall-en_US-ljspeech with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      matcha: {
        acousticModel: 'matcha-icefall-en_US-ljspeech/model-steps-3.onnx',
        vocoder: 'vocos-22khz-univ.onnx',
        tokens: 'matcha-icefall-en_US-ljspeech/tokens.txt',
        dataDir: 'matcha-icefall-en_US-ljspeech/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to play with matcha-icefall-en_US-ljspeech using the Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final matcha = sherpa_onnx.OfflineTtsMatchaModelConfig(
    acousticModel: 'matcha-icefall-en_US-ljspeech/model-steps-3.onnx',
    vocoder: 'vocos-22khz-univ.onnx',
    tokens: 'matcha-icefall-en_US-ljspeech/tokens.txt',
    dataDir: 'matcha-icefall-en_US-ljspeech/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    matcha: matcha,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with matcha-icefall-en_US-ljspeech using the Swift API.

func run() {
  let matcha = sherpaOnnxOfflineTtsMatchaModelConfig(
    acousticModel: "matcha-icefall-en_US-ljspeech/model-steps-3.onnx",
    vocoder: "vocos-22khz-univ.onnx",
    tokens: "matcha-icefall-en_US-ljspeech/tokens.txt",
    dataDir: "matcha-icefall-en_US-ljspeech/espeak-ng-data",
    lexicon: ""
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(matcha: matcha)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with matcha-icefall-en_US-ljspeech using the C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Matcha.AcousticModel = "matcha-icefall-en_US-ljspeech/model-steps-3.onnx";
config.Model.Matcha.Vocoder = "vocos-22khz-univ.onnx";
config.Model.Matcha.Tokens = "matcha-icefall-en_US-ljspeech/tokens.txt";
config.Model.Matcha.DataDir = "matcha-icefall-en_US-ljspeech/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with matcha-icefall-en_US-ljspeech with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      matcha = OfflineTtsMatchaModelConfig(
        acousticModel = "matcha-icefall-en_US-ljspeech/model-steps-3.onnx",
        vocoder = "vocos-22khz-univ.onnx",
        tokens = "matcha-icefall-en_US-ljspeech/tokens.txt",
        dataDir = "matcha-icefall-en_US-ljspeech/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with matcha-icefall-en_US-ljspeech with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var matcha = new OfflineTtsMatchaModelConfig();
    matcha.setAcousticModel("matcha-icefall-en_US-ljspeech/model-steps-3.onnx");
    matcha.setVocoder("vocos-22khz-univ.onnx");
    matcha.setTokens("matcha-icefall-en_US-ljspeech/tokens.txt");
    matcha.setDataDir("matcha-icefall-en_US-ljspeech/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setMatcha(matcha);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with matcha-icefall-en_US-ljspeech with Pascal API.

program test_matcha;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Matcha.AcousticModel := 'matcha-icefall-en_US-ljspeech/model-steps-3.onnx';
  Config.Model.Matcha.Vocoder := 'vocos-22khz-univ.onnx';
  Config.Model.Matcha.Tokens := 'matcha-icefall-en_US-ljspeech/tokens.txt';
  Config.Model.Matcha.DataDir := 'matcha-icefall-en_US-ljspeech/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with matcha-icefall-en_US-ljspeech with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Matcha: sherpa.OfflineTtsMatchaModelConfig{
				AcousticModel: "matcha-icefall-en_US-ljspeech/model-steps-3.onnx",
				Vocoder:       "vocos-22khz-univ.onnx",
				Tokens:        "matcha-icefall-en_US-ljspeech/tokens.txt",
				DataDir:       "matcha-icefall-en_US-ljspeech/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.

sample audio for each speaker is listed below:

Speaker 0

kitten-nano-en-v0_1-fp16

Info about this model

This model is kitten-tts-nano-0.1 and it is from https://huggingface.co/KittenML/kitten-tts-nano-0.1

It supports only English.

Number of speakers | Sample rate
8 | 24000

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kitten-nano-en-v0_1-fp16.tar.bz2
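
As a minimal sketch, the archive at the address above could be fetched and unpacked with Python's standard library. The helper names `model_url` and `download_and_extract` are illustrative and not part of sherpa-onnx. The sketch also assumes that the archive unpacks into a directory named after the model, as the paths in the code examples of this section suggest.

```python
import tarfile
import urllib.request
from pathlib import Path

# URL prefix taken from the download address above.
BASE_URL = "https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models"


def model_url(name: str) -> str:
    """Return the release URL of a model archive (hypothetical helper)."""
    return f"{BASE_URL}/{name}.tar.bz2"


def download_and_extract(name: str, dest: str = ".") -> Path:
    """Download <name>.tar.bz2 into dest and unpack it (hypothetical helper)."""
    dest_dir = Path(dest)
    dest_dir.mkdir(parents=True, exist_ok=True)
    archive = dest_dir / f"{name}.tar.bz2"
    urllib.request.urlretrieve(model_url(name), str(archive))
    with tarfile.open(archive, "r:bz2") as tar:
        tar.extractall(dest_dir)
    archive.unlink()  # remove the archive once it has been unpacked
    # Assumption: the archive contains a top-level directory called <name>.
    return dest_dir / name


# Example (requires network access):
# download_and_extract("kitten-nano-en-v0_1-fp16")
```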

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI | URL | China mirror
arm64-v8a | Download | Download
armeabi-v7a | Download | Download
x86_64 | Download | Download
x86 | Download | Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

Meaning of speaker suffix

Suffix | Meaning
f | Female
m | Male

speaker ID to speaker name (sid -> name)

The mapping from speaker ID (sid) to speaker name is given below:

0 - 1 | 0 -> expr-voice-2-m | 1 -> expr-voice-2-f
2 - 3 | 2 -> expr-voice-3-m | 3 -> expr-voice-3-f
4 - 5 | 4 -> expr-voice-4-m | 5 -> expr-voice-4-f
6 - 7 | 6 -> expr-voice-5-m | 7 -> expr-voice-5-f

speaker name to speaker ID (name -> sid)

The mapping from speaker name to speaker ID (sid) is given below:

0 - 1 | expr-voice-2-m -> 0 | expr-voice-2-f -> 1
2 - 3 | expr-voice-3-m -> 2 | expr-voice-3-f -> 3
4 - 5 | expr-voice-4-m -> 4 | expr-voice-4-f -> 5
6 - 7 | expr-voice-5-m -> 6 | expr-voice-5-f -> 7
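
The two mappings above can be kept in code as a single list plus a reverse lookup. This is only a sketch: `SPEAKER_NAMES`, `NAME_TO_SID`, and `sid_for` are illustrative names, not part of the sherpa-onnx API.

```python
# Speaker names in sid order, copied from the mapping above.
SPEAKER_NAMES = [
    "expr-voice-2-m",
    "expr-voice-2-f",
    "expr-voice-3-m",
    "expr-voice-3-f",
    "expr-voice-4-m",
    "expr-voice-4-f",
    "expr-voice-5-m",
    "expr-voice-5-f",
]

# Reverse lookup: speaker name -> sid.
NAME_TO_SID = {name: sid for sid, name in enumerate(SPEAKER_NAMES)}


def sid_for(name: str) -> int:
    """Return the sid for a speaker name; raise ValueError if it is unknown."""
    try:
        return NAME_TO_SID[name]
    except KeyError:
        raise ValueError(f"Unknown speaker name: {name!r}") from None
```

The returned integer is the value to pass as `sid` when generating audio.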

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx soundfile

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kitten-nano-en-v0_1-fp16.tar.bz2

You can use the following code to play with kitten-nano-en-v0_1-fp16 with Python API.

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        kitten=sherpa_onnx.OfflineTtsKittenModelConfig(
            model="./kitten-nano-en-v0_1-fp16/model.fp16.onnx",
            voices="./kitten-nano-en-v0_1-fp16/voices.bin",
            tokens="./kitten-nano-en-v0_1-fp16/tokens.txt",
            data_dir="./kitten-nano-en-v0_1-fp16/espeak-ng-data",
        ),
        num_threads=2,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.", sid=0, speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
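
To audition all eight voices, the `generate` call above can be repeated once per speaker ID. The sketch below assumes only what the example above shows, namely that the `tts` object accepts `generate(text=..., sid=..., speed=...)`. The function name `generate_all_speakers` is hypothetical.

```python
from typing import Any, Dict


def generate_all_speakers(tts: Any, text: str, num_speakers: int = 8,
                          speed: float = 1.0) -> Dict[int, Any]:
    """Call tts.generate once per speaker ID and collect the results.

    `tts` is expected to behave like the OfflineTts object created above,
    i.e. to accept generate(text=..., sid=..., speed=...).
    """
    if num_speakers <= 0:
        raise ValueError("num_speakers must be positive")
    return {
        sid: tts.generate(text=text, sid=sid, speed=speed)
        for sid in range(num_speakers)
    }
```

Each returned audio object can then be written to its own file with `sf.write`, using `audio.samples` and `audio.sample_rate` as in the example above.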

C API

You can use the following code to play with kitten-nano-en-v0_1-fp16 with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.kitten.model = "./kitten-nano-en-v0_1-fp16/model.fp16.onnx";
  config.model.kitten.voices = "./kitten-nano-en-v0_1-fp16/voices.bin";
  config.model.kitten.tokens = "./kitten-nano-en-v0_1-fp16/tokens.txt";
  config.model.kitten.data_dir = "./kitten-nano-en-v0_1-fp16/espeak-ng-data";

  config.model.num_threads = 1;
  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
  -DSHERPA_ONNX_ENABLE_C_API=ON \
  -DCMAKE_BUILD_TYPE=Release \
  -DBUILD_SHARED_LIBS=ON \
  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
  ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-kitten.c. Then you can compile it with the following command:

gcc \
  -I /tmp/sherpa-onnx/shared/include \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-c-api \
  -lonnxruntime \
  -o /tmp/test-kitten \
  /tmp/test-kitten.c

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-kitten

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-kitten.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

C++ API

You can use the following code to play with kitten-nano-en-v0_1-fp16 with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.kitten.model = "./kitten-nano-en-v0_1-fp16/model.fp16.onnx";
  config.model.kitten.voices = "./kitten-nano-en-v0_1-fp16/voices.bin";
  config.model.kitten.tokens = "./kitten-nano-en-v0_1-fp16/tokens.txt";
  config.model.kitten.data_dir = "./kitten-nano-en-v0_1-fp16/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
  -DSHERPA_ONNX_ENABLE_C_API=ON \
  -DCMAKE_BUILD_TYPE=Release \
  -DBUILD_SHARED_LIBS=ON \
  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
  ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-kitten.cc. Then you can compile it with the following command:

g++ \
  -std=c++17 \
  -I /tmp/sherpa-onnx/shared/include \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-cxx-api \
  -lsherpa-onnx-c-api \
  -lonnxruntime \
  -o /tmp/test-kitten \
  /tmp/test-kitten.cc

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-kitten

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-kitten.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html
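
The `ProgressCallback` in the C++ example above controls generation through its return value: 1 continues and 0 stops. The sketch below illustrates that contract in plain Python with a callback that asks generation to stop once a given amount of audio has arrived. It is not tied to any binding: the `(samples, progress)` argument shape mirrors the C++ callback and is an assumption here, and `make_limited_callback` is a hypothetical helper.

```python
from typing import Callable, Sequence


def make_limited_callback(
    max_seconds: float, sample_rate: int
) -> Callable[[Sequence[float], float], int]:
    """Build a callback that returns 0 (stop) once max_seconds of audio arrived."""
    max_samples = int(max_seconds * sample_rate)
    received = 0

    def callback(samples: Sequence[float], progress: float) -> int:
        nonlocal received
        received += len(samples)
        # 1 means continue generating, 0 means stop.
        return 1 if received < max_samples else 0

    return callback
```

Keeping the running total in a closure avoids global state, so several generations can each use their own callback.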

Rust API

You can use the following code to play with kitten-nano-en-v0_1-fp16 with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsKittenModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            kitten: OfflineTtsKittenModelConfig {
                model: Some("./kitten-nano-en-v0_1-fp16/model.fp16.onnx".into()),
                voices: Some("./kitten-nano-en-v0_1-fp16/voices.bin".into()),
                tokens: Some("./kitten-nano-en-v0_1-fp16/tokens.txt".into()),
                data_dir: Some("./kitten-nano-en-v0_1-fp16/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with kitten-nano-en-v0_1-fp16 with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

async function createOfflineTtsAsync() {
  const config = {
    model: {
      kitten: {
        model: './kitten-nano-en-v0_1-fp16/model.fp16.onnx',
        voices: './kitten-nano-en-v0_1-fp16/voices.bin',
        tokens: './kitten-nano-en-v0_1-fp16/tokens.txt',
        dataDir: './kitten-nano-en-v0_1-fp16/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return await sherpa_onnx.OfflineTts.createAsync(config);
}

async function main() {
  const tts = await createOfflineTtsAsync();

  const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';

  console.log('Number of speakers:', tts.numSpeakers);
  console.log('Sample rate:', tts.sampleRate);

  const start = Date.now();
  const generationConfig = new sherpa_onnx.GenerationConfig({
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  });

  const audio = await tts.generateAsync({
    text,
    generationConfig,
    onProgress({samples, progress}) {
      process.stdout.write(
          `\rGenerating... ${(progress * 100).toFixed(1)}%`);
      return true;
    },
  });

  console.log('\nGeneration finished.');

  const stop = Date.now();
  const elapsedSeconds = (stop - start) / 1000;
  const durationSeconds = audio.samples.length / audio.sampleRate;
  const realTimeFactor = elapsedSeconds / durationSeconds;

  console.log('Wave duration:', durationSeconds.toFixed(3), 'seconds');
  console.log('Elapsed time:', elapsedSeconds.toFixed(3), 'seconds');
  console.log(
      `RTF = ${elapsedSeconds.toFixed(3)} / ${durationSeconds.toFixed(3)} =`,
      realTimeFactor.toFixed(3));

  const filename = 'test.wav';
  sherpa_onnx.writeWave(filename, {
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  });

  console.log(`Saved to ${filename}`);
}

main().catch((err) => {
  console.error('TTS failed:', err);
  process.exit(1);
});

Please refer to the Node.js addon API documentation for more details.
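
The example above reports the real-time factor (RTF), which is the elapsed generation time divided by the duration of the generated audio. The same arithmetic is sketched below as a small Python function. The name `real_time_factor` is illustrative only.

```python
def real_time_factor(elapsed_seconds: float, num_samples: int,
                     sample_rate: int) -> float:
    """Return elapsed time divided by audio duration (lower is faster)."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration_seconds = num_samples / sample_rate
    return elapsed_seconds / duration_seconds
```

An RTF below 1 means the audio was generated faster than it takes to play back.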

Dart API

You can use the following code to play with kitten-nano-en-v0_1-fp16 with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final kitten = sherpa_onnx.OfflineTtsKittenModelConfig(
    model: './kitten-nano-en-v0_1-fp16/model.fp16.onnx',
    voices: './kitten-nano-en-v0_1-fp16/voices.bin',
    tokens: './kitten-nano-en-v0_1-fp16/tokens.txt',
    dataDir: './kitten-nano-en-v0_1-fp16/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    kitten: kitten,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with kitten-nano-en-v0_1-fp16 with Swift API.

func run() {
  let kitten = sherpaOnnxOfflineTtsKittenModelConfig(
    model: "./kitten-nano-en-v0_1-fp16/model.fp16.onnx",
    voices: "./kitten-nano-en-v0_1-fp16/voices.bin",
    tokens: "./kitten-nano-en-v0_1-fp16/tokens.txt",
    dataDir: "./kitten-nano-en-v0_1-fp16/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(kitten: kitten)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with kitten-nano-en-v0_1-fp16 with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Kitten.Model = "./kitten-nano-en-v0_1-fp16/model.fp16.onnx";
config.Model.Kitten.Voices = "./kitten-nano-en-v0_1-fp16/voices.bin";
config.Model.Kitten.Tokens = "./kitten-nano-en-v0_1-fp16/tokens.txt";
config.Model.Kitten.DataDir = "./kitten-nano-en-v0_1-fp16/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with kitten-nano-en-v0_1-fp16 with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      kitten = OfflineTtsKittenModelConfig(
        model = "./kitten-nano-en-v0_1-fp16/model.fp16.onnx",
        voices = "./kitten-nano-en-v0_1-fp16/voices.bin",
        tokens = "./kitten-nano-en-v0_1-fp16/tokens.txt",
        dataDir = "./kitten-nano-en-v0_1-fp16/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with kitten-nano-en-v0_1-fp16 with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var kitten = new OfflineTtsKittenModelConfig();
    kitten.setModel("./kitten-nano-en-v0_1-fp16/model.fp16.onnx");
    kitten.setVoices("./kitten-nano-en-v0_1-fp16/voices.bin");
    kitten.setTokens("./kitten-nano-en-v0_1-fp16/tokens.txt");
    kitten.setDataDir("./kitten-nano-en-v0_1-fp16/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setKitten(kitten);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with kitten-nano-en-v0_1-fp16 with Pascal API.

program test_kitten;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Kitten.Model := './kitten-nano-en-v0_1-fp16/model.fp16.onnx';
  Config.Model.Kitten.Voices := './kitten-nano-en-v0_1-fp16/voices.bin';
  Config.Model.Kitten.Tokens := './kitten-nano-en-v0_1-fp16/tokens.txt';
  Config.Model.Kitten.DataDir := './kitten-nano-en-v0_1-fp16/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with kitten-nano-en-v0_1-fp16 with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Kitten: sherpa.OfflineTtsKittenModelConfig{
				Model:   "./kitten-nano-en-v0_1-fp16/model.fp16.onnx",
				Voices:  "./kitten-nano-en-v0_1-fp16/voices.bin",
				Tokens:  "./kitten-nano-en-v0_1-fp16/tokens.txt",
				DataDir: "./kitten-nano-en-v0_1-fp16/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.

sample audio for each speaker is listed below:

Speaker 0 - expr-voice-2-m

Speaker 1 - expr-voice-2-f

Speaker 2 - expr-voice-3-m

Speaker 3 - expr-voice-3-f

Speaker 4 - expr-voice-4-m

Speaker 5 - expr-voice-4-f

Speaker 6 - expr-voice-5-m

Speaker 7 - expr-voice-5-f

kitten-nano-en-v0_2-fp16

Info about this model

This model is kitten-tts-nano-0.2 and it is from https://huggingface.co/KittenML/kitten-tts-nano-0.2

It supports only English.

Number of speakers | Sample rate
8 | 24000

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kitten-nano-en-v0_2-fp16.tar.bz2
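
After extracting the archive, it can be useful to confirm that the files referenced by the code examples in this section are present before creating a TTS object. The check below is a sketch: `missing_model_files` is a hypothetical helper, and the list of entries is taken from the paths used in those examples rather than from an official manifest.

```python
from pathlib import Path
from typing import List

# Entries referenced by the code examples in this section.
REQUIRED_ENTRIES = ["model.fp16.onnx", "voices.bin", "tokens.txt", "espeak-ng-data"]


def missing_model_files(model_dir: str) -> List[str]:
    """Return the required entries that are absent from model_dir."""
    root = Path(model_dir)
    return [name for name in REQUIRED_ENTRIES if not (root / name).exists()]
```

An empty list from `missing_model_files("./kitten-nano-en-v0_2-fp16")` means all four entries were found.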

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI | URL | China mirror
arm64-v8a | Download | Download
armeabi-v7a | Download | Download
x86_64 | Download | Download
x86 | Download | Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

Meaning of speaker suffix

Suffix | Meaning
f | Female
m | Male
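
The suffix convention above can be applied programmatically, for example to label voices in a user interface. The sketch below reads the trailing `-f` or `-m` of a speaker name. `SUFFIX_MEANING` and `speaker_gender` are illustrative names, not part of sherpa-onnx.

```python
# Suffix meanings copied from the table above.
SUFFIX_MEANING = {"f": "Female", "m": "Male"}


def speaker_gender(name: str) -> str:
    """Return 'Female' or 'Male' based on the trailing -f / -m suffix."""
    suffix = name.rsplit("-", 1)[-1]
    if suffix not in SUFFIX_MEANING:
        raise ValueError(f"Unrecognized speaker suffix in {name!r}")
    return SUFFIX_MEANING[suffix]
```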

speaker ID to speaker name (sid -> name)

The mapping from speaker ID (sid) to speaker name is given below:

0 - 1 | 0 -> expr-voice-2-m | 1 -> expr-voice-2-f
2 - 3 | 2 -> expr-voice-3-m | 3 -> expr-voice-3-f
4 - 5 | 4 -> expr-voice-4-m | 5 -> expr-voice-4-f
6 - 7 | 6 -> expr-voice-5-m | 7 -> expr-voice-5-f

speaker name to speaker ID (name -> sid)

The mapping from speaker name to speaker ID (sid) is given below:

0 - 1 | expr-voice-2-m -> 0 | expr-voice-2-f -> 1
2 - 3 | expr-voice-3-m -> 2 | expr-voice-3-f -> 3
4 - 5 | expr-voice-4-m -> 4 | expr-voice-4-f -> 5
6 - 7 | expr-voice-5-m -> 6 | expr-voice-5-f -> 7

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx soundfile

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kitten-nano-en-v0_2-fp16.tar.bz2

You can use the following code to play with kitten-nano-en-v0_2-fp16 with Python API.

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        kitten=sherpa_onnx.OfflineTtsKittenModelConfig(
            model="./kitten-nano-en-v0_2-fp16/model.fp16.onnx",
            voices="./kitten-nano-en-v0_2-fp16/voices.bin",
            tokens="./kitten-nano-en-v0_2-fp16/tokens.txt",
            data_dir="./kitten-nano-en-v0_2-fp16/espeak-ng-data",
        ),
        num_threads=2,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.", sid=0, speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)

C API

You can use the following code to play with kitten-nano-en-v0_2-fp16 with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.kitten.model = "./kitten-nano-en-v0_2-fp16/model.fp16.onnx";
  config.model.kitten.voices = "./kitten-nano-en-v0_2-fp16/voices.bin";
  config.model.kitten.tokens = "./kitten-nano-en-v0_2-fp16/tokens.txt";
  config.model.kitten.data_dir = "./kitten-nano-en-v0_2-fp16/espeak-ng-data";

  config.model.num_threads = 1;
  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-kitten.c. Then you can compile it with the following command:

gcc   -I /tmp/sherpa-onnx/shared/include   -L /tmp/sherpa-onnx/shared/lib   -o /tmp/test-kitten   /tmp/test-kitten.c   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-kitten

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-kitten.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

C++ API

You can use the following code to play with kitten-nano-en-v0_2-fp16 with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.kitten.model = "./kitten-nano-en-v0_2-fp16/model.fp16.onnx";
  config.model.kitten.voices = "./kitten-nano-en-v0_2-fp16/voices.bin";
  config.model.kitten.tokens = "./kitten-nano-en-v0_2-fp16/tokens.txt";
  config.model.kitten.data_dir = "./kitten-nano-en-v0_2-fp16/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-kitten.cc. Then you can compile it with the following command:

g++   -std=c++17   -I /tmp/sherpa-onnx/shared/include   -L /tmp/sherpa-onnx/shared/lib   -o /tmp/test-kitten   /tmp/test-kitten.cc   -lsherpa-onnx-cxx-api   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-kitten

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-kitten.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with kitten-nano-en-v0_2-fp16 with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsKittenModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            kitten: OfflineTtsKittenModelConfig {
                model: Some("./kitten-nano-en-v0_2-fp16/model.fp16.onnx".into()),
                voices: Some("./kitten-nano-en-v0_2-fp16/voices.bin".into()),
                tokens: Some("./kitten-nano-en-v0_2-fp16/tokens.txt".into()),
                data_dir: Some("./kitten-nano-en-v0_2-fp16/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with kitten-nano-en-v0_2-fp16 with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

async function createOfflineTtsAsync() {
  const config = {
    model: {
      kitten: {
        model: './kitten-nano-en-v0_2-fp16/model.fp16.onnx',
        voices: './kitten-nano-en-v0_2-fp16/voices.bin',
        tokens: './kitten-nano-en-v0_2-fp16/tokens.txt',
        dataDir: './kitten-nano-en-v0_2-fp16/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return await sherpa_onnx.OfflineTts.createAsync(config);
}

async function main() {
  const tts = await createOfflineTtsAsync();

  const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';

  console.log('Number of speakers:', tts.numSpeakers);
  console.log('Sample rate:', tts.sampleRate);

  const start = Date.now();
  const generationConfig = new sherpa_onnx.GenerationConfig({
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  });

  const audio = await tts.generateAsync({
    text,
    generationConfig,
    onProgress({samples, progress}) {
      process.stdout.write(
          `\rGenerating... ${(progress * 100).toFixed(1)}%`);
      return true;
    },
  });

  console.log('\nGeneration finished.');

  const stop = Date.now();
  const elapsedSeconds = (stop - start) / 1000;
  const durationSeconds = audio.samples.length / audio.sampleRate;
  const realTimeFactor = elapsedSeconds / durationSeconds;

  console.log('Wave duration:', durationSeconds.toFixed(3), 'seconds');
  console.log('Elapsed time:', elapsedSeconds.toFixed(3), 'seconds');
  console.log(
      `RTF = ${elapsedSeconds.toFixed(3)} / ${durationSeconds.toFixed(3)} =`,
      realTimeFactor.toFixed(3));

  const filename = 'test.wav';
  sherpa_onnx.writeWave(filename, {
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  });

  console.log(`Saved to ${filename}`);
}

main().catch((err) => {
  console.error('TTS failed:', err);
  process.exit(1);
});

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to play with kitten-nano-en-v0_2-fp16 with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final kitten = sherpa_onnx.OfflineTtsKittenModelConfig(
    model: './kitten-nano-en-v0_2-fp16/model.fp16.onnx',
    voices: './kitten-nano-en-v0_2-fp16/voices.bin',
    tokens: './kitten-nano-en-v0_2-fp16/tokens.txt',
    dataDir: './kitten-nano-en-v0_2-fp16/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    kitten: kitten,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with kitten-nano-en-v0_2-fp16 with Swift API.

func run() {
  let kitten = sherpaOnnxOfflineTtsKittenModelConfig(
    model: "./kitten-nano-en-v0_2-fp16/model.fp16.onnx",
    voices: "./kitten-nano-en-v0_2-fp16/voices.bin",
    tokens: "./kitten-nano-en-v0_2-fp16/tokens.txt",
    dataDir: "./kitten-nano-en-v0_2-fp16/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(kitten: kitten)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with kitten-nano-en-v0_2-fp16 with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Kitten.Model = "./kitten-nano-en-v0_2-fp16/model.fp16.onnx";
config.Model.Kitten.Voices = "./kitten-nano-en-v0_2-fp16/voices.bin";
config.Model.Kitten.Tokens = "./kitten-nano-en-v0_2-fp16/tokens.txt";
config.Model.Kitten.DataDir = "./kitten-nano-en-v0_2-fp16/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with kitten-nano-en-v0_2-fp16 with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      kitten = OfflineTtsKittenModelConfig(
        model = "./kitten-nano-en-v0_2-fp16/model.fp16.onnx",
        voices = "./kitten-nano-en-v0_2-fp16/voices.bin",
        tokens = "./kitten-nano-en-v0_2-fp16/tokens.txt",
        dataDir = "./kitten-nano-en-v0_2-fp16/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with kitten-nano-en-v0_2-fp16 with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var kitten = new OfflineTtsKittenModelConfig();
    kitten.setModel("./kitten-nano-en-v0_2-fp16/model.fp16.onnx");
    kitten.setVoices("./kitten-nano-en-v0_2-fp16/voices.bin");
    kitten.setTokens("./kitten-nano-en-v0_2-fp16/tokens.txt");
    kitten.setDataDir("./kitten-nano-en-v0_2-fp16/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setKitten(kitten);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with kitten-nano-en-v0_2-fp16 with Pascal API.

program test_kitten;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Kitten.Model := './kitten-nano-en-v0_2-fp16/model.fp16.onnx';
  Config.Model.Kitten.Voices := './kitten-nano-en-v0_2-fp16/voices.bin';
  Config.Model.Kitten.Tokens := './kitten-nano-en-v0_2-fp16/tokens.txt';
  Config.Model.Kitten.DataDir := './kitten-nano-en-v0_2-fp16/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with kitten-nano-en-v0_2-fp16 with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Kitten: sherpa.OfflineTtsKittenModelConfig{
				Model:   "./kitten-nano-en-v0_2-fp16/model.fp16.onnx",
				Voices:  "./kitten-nano-en-v0_2-fp16/voices.bin",
				Tokens:  "./kitten-nano-en-v0_2-fp16/tokens.txt",
				DataDir: "./kitten-nano-en-v0_2-fp16/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.

sample audios for different speakers are listed below:

Speaker 0 - expr-voice-2-m

Speaker 1 - expr-voice-2-f

Speaker 2 - expr-voice-3-m

Speaker 3 - expr-voice-3-f

Speaker 4 - expr-voice-4-m

Speaker 5 - expr-voice-4-f

Speaker 6 - expr-voice-5-m

Speaker 7 - expr-voice-5-f
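
A set of samples like the one above can be produced by looping over the speaker IDs. The following is a minimal Python sketch, not the script used to create the samples listed here. It reuses only the calls shown in the Python API example for this model; SPEAKERS, sample_filename, and generate_all_speakers are hypothetical names introduced for illustration, and the output naming scheme is an assumption.

```python
from pathlib import Path

# Speaker names in sid order, copied from the list above.
SPEAKERS = [
    "expr-voice-2-m", "expr-voice-2-f",
    "expr-voice-3-m", "expr-voice-3-f",
    "expr-voice-4-m", "expr-voice-4-f",
    "expr-voice-5-m", "expr-voice-5-f",
]


def sample_filename(sid: int) -> str:
    """Hypothetical naming scheme, e.g. sid-0-expr-voice-2-m.wav."""
    return f"sid-{sid}-{SPEAKERS[sid]}.wav"


def generate_all_speakers(model_dir: str, text: str, out_dir: str = ".") -> list:
    """Write one wave file per speaker and return the list of paths."""
    # Imported here so that the helpers above work without these packages.
    import sherpa_onnx
    import soundfile as sf

    config = sherpa_onnx.OfflineTtsConfig(
        model=sherpa_onnx.OfflineTtsModelConfig(
            kitten=sherpa_onnx.OfflineTtsKittenModelConfig(
                model=f"{model_dir}/model.fp16.onnx",
                voices=f"{model_dir}/voices.bin",
                tokens=f"{model_dir}/tokens.txt",
                data_dir=f"{model_dir}/espeak-ng-data",
            ),
            num_threads=2,
        ),
    )
    if not config.validate():
        raise ValueError("Please check your config")

    tts = sherpa_onnx.OfflineTts(config)
    paths = []
    for sid in range(len(SPEAKERS)):
        audio = tts.generate(text=text, sid=sid, speed=1.0)
        path = Path(out_dir) / sample_filename(sid)
        sf.write(str(path), audio.samples, samplerate=audio.sample_rate)
        paths.append(path)
    return paths
```

For example, generate_all_speakers("./kitten-nano-en-v0_2-fp16", "Hello world") would write eight files into the current directory, assuming the model has been downloaded and extracted there.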

kitten-mini-en-v0_1-fp16

Info about this model

This model is kitten-tts-mini-0.1 and it is from https://huggingface.co/KittenML/kitten-tts-mini-0.1

It supports only English.

Number of speakers    Sample rate
8                     24000

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kitten-mini-en-v0_1-fp16.tar.bz2
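
If you prefer to fetch the archive from a script, the following is a minimal Python sketch that uses only the standard library. The helper names model_url and download_and_extract are hypothetical. The sketch assumes that the archive unpacks into a directory named after the model, which matches the paths used in the examples below.

```python
import tarfile
import urllib.request
from pathlib import Path

BASE_URL = "https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models"


def model_url(name: str) -> str:
    """Build the download URL for a model archive on the release page."""
    return f"{BASE_URL}/{name}.tar.bz2"


def download_and_extract(name: str, dest: str = ".") -> Path:
    """Download <name>.tar.bz2 into dest, unpack it, and delete the archive."""
    dest_dir = Path(dest)
    dest_dir.mkdir(parents=True, exist_ok=True)
    archive = dest_dir / f"{name}.tar.bz2"

    urllib.request.urlretrieve(model_url(name), str(archive))
    with tarfile.open(archive, "r:bz2") as tar:
        tar.extractall(dest_dir)
    archive.unlink()

    # Assumption: the archive contains a top-level directory called <name>.
    return dest_dir / name
```

Calling download_and_extract("kitten-mini-en-v0_1-fp16") from the directory where you run the examples should leave the model files under ./kitten-mini-en-v0_1-fp16.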

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI            URL         China mirror
arm64-v8a      Download    Download
armeabi-v7a    Download    Download
x86_64         Download    Download
x86            Download    Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

Meaning of speaker suffix

Suffix    Meaning
f         Female
m         Male

speaker ID to speaker name (sid -> name)

The mapping from speaker ID (sid) to speaker name is given below:

0 -> expr-voice-2-m
1 -> expr-voice-2-f
2 -> expr-voice-3-m
3 -> expr-voice-3-f
4 -> expr-voice-4-m
5 -> expr-voice-4-f
6 -> expr-voice-5-m
7 -> expr-voice-5-f

speaker name to speaker ID (name -> sid)

The mapping from speaker name to speaker ID (sid) is given below:

expr-voice-2-m -> 0
expr-voice-2-f -> 1
expr-voice-3-m -> 2
expr-voice-3-f -> 3
expr-voice-4-m -> 4
expr-voice-4-f -> 5
expr-voice-5-m -> 6
expr-voice-5-f -> 7
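
The two mappings above can be kept in a single lookup table in application code. The following is a minimal Python sketch. The helpers sid_to_name, name_to_sid, and speaker_gender are hypothetical and not part of sherpa-onnx; they rely only on the eight names listed above and on the f/m suffix convention.

```python
# Hypothetical helpers, not part of sherpa-onnx.
# The list index is the speaker ID (sid).
SPEAKER_NAMES = [
    "expr-voice-2-m",
    "expr-voice-2-f",
    "expr-voice-3-m",
    "expr-voice-3-f",
    "expr-voice-4-m",
    "expr-voice-4-f",
    "expr-voice-5-m",
    "expr-voice-5-f",
]

NAME_TO_SID = {name: sid for sid, name in enumerate(SPEAKER_NAMES)}


def sid_to_name(sid: int) -> str:
    """Return the speaker name for a speaker ID."""
    if not 0 <= sid < len(SPEAKER_NAMES):
        raise ValueError(f"sid must be in [0, {len(SPEAKER_NAMES) - 1}], got {sid}")
    return SPEAKER_NAMES[sid]


def name_to_sid(name: str) -> int:
    """Return the speaker ID for a speaker name."""
    if name not in NAME_TO_SID:
        raise ValueError(f"Unknown speaker name: {name}")
    return NAME_TO_SID[name]


def speaker_gender(name: str) -> str:
    """Decode the trailing suffix: f means female, m means male."""
    suffix = name.rsplit("-", 1)[-1]
    return {"f": "female", "m": "male"}[suffix]
```

The value returned by name_to_sid can be passed as the sid argument in the examples below.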

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kitten-mini-en-v0_1-fp16.tar.bz2

You can use the following code to play with kitten-mini-en-v0_1-fp16.

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        kitten=sherpa_onnx.OfflineTtsKittenModelConfig(
            model="./kitten-mini-en-v0_1-fp16/model.fp16.onnx",
            voices="./kitten-mini-en-v0_1-fp16/voices.bin",
            tokens="./kitten-mini-en-v0_1-fp16/tokens.txt",
            data_dir="./kitten-mini-en-v0_1-fp16/espeak-ng-data",
        ),
        num_threads=2,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.", sid=0, speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
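
The Node.js example later in this section reports a real-time factor (RTF), that is, the elapsed time divided by the duration of the generated audio. The same measurement can be added to the Python example. The following is a minimal sketch; real_time_factor and timed are hypothetical helper names, not part of sherpa-onnx.

```python
import time


def real_time_factor(elapsed_seconds: float, num_samples: int, sample_rate: int) -> float:
    """RTF = elapsed time / audio duration. Values below 1 are faster than real time."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration_seconds = num_samples / sample_rate
    return elapsed_seconds / duration_seconds


def timed(fn, *args, **kwargs):
    """Call fn and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start
```

With the tts object from the example above, you could write audio, elapsed = timed(tts.generate, text="Hello", sid=0, speed=1.0) and then call real_time_factor(elapsed, len(audio.samples), audio.sample_rate).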

C API

You can use the following code to play with kitten-mini-en-v0_1-fp16 with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.kitten.model = "./kitten-mini-en-v0_1-fp16/model.fp16.onnx";
  config.model.kitten.voices = "./kitten-mini-en-v0_1-fp16/voices.bin";
  config.model.kitten.tokens = "./kitten-mini-en-v0_1-fp16/tokens.txt";
  config.model.kitten.data_dir = "./kitten-mini-en-v0_1-fp16/espeak-ng-data";

  config.model.num_threads = 1;
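  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
  if (!tts) {
    // Creation fails, e.g., when a model file cannot be found
    fprintf(stderr, "Failed to create the TTS engine. Please check your config\n");
    return -1;
  }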

  const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-kitten.c. Then you can compile it with the following command:

gcc   -I /tmp/sherpa-onnx/shared/include   -L /tmp/sherpa-onnx/shared/lib   -o /tmp/test-kitten   /tmp/test-kitten.c   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-kitten

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-kitten.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

C++ API

You can use the following code to play with kitten-mini-en-v0_1-fp16 with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.kitten.model = "./kitten-mini-en-v0_1-fp16/model.fp16.onnx";
  config.model.kitten.voices = "./kitten-mini-en-v0_1-fp16/voices.bin";
  config.model.kitten.tokens = "./kitten-mini-en-v0_1-fp16/tokens.txt";
  config.model.kitten.data_dir = "./kitten-mini-en-v0_1-fp16/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-kitten.cc. Then you can compile it with the following command:

g++   -std=c++17   -I /tmp/sherpa-onnx/shared/include   -L /tmp/sherpa-onnx/shared/lib   -o /tmp/test-kitten   /tmp/test-kitten.cc   -lsherpa-onnx-cxx-api   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-kitten

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-kitten.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with kitten-mini-en-v0_1-fp16 with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsKittenModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            kitten: OfflineTtsKittenModelConfig {
                model: Some("./kitten-mini-en-v0_1-fp16/model.fp16.onnx".into()),
                voices: Some("./kitten-mini-en-v0_1-fp16/voices.bin".into()),
                tokens: Some("./kitten-mini-en-v0_1-fp16/tokens.txt".into()),
                data_dir: Some("./kitten-mini-en-v0_1-fp16/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with kitten-mini-en-v0_1-fp16 with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

async function createOfflineTtsAsync() {
  const config = {
    model: {
      kitten: {
        model: './kitten-mini-en-v0_1-fp16/model.fp16.onnx',
        voices: './kitten-mini-en-v0_1-fp16/voices.bin',
        tokens: './kitten-mini-en-v0_1-fp16/tokens.txt',
        dataDir: './kitten-mini-en-v0_1-fp16/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return await sherpa_onnx.OfflineTts.createAsync(config);
}

async function main() {
  const tts = await createOfflineTtsAsync();

  const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';

  console.log('Number of speakers:', tts.numSpeakers);
  console.log('Sample rate:', tts.sampleRate);

  const start = Date.now();
  const generationConfig = new sherpa_onnx.GenerationConfig({
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  });

  const audio = await tts.generateAsync({
    text,
    generationConfig,
    onProgress({samples, progress}) {
      process.stdout.write(
          `\rGenerating... ${(progress * 100).toFixed(1)}%`);
      return true;
    },
  });

  console.log('\nGeneration finished.');

  const stop = Date.now();
  const elapsedSeconds = (stop - start) / 1000;
  const durationSeconds = audio.samples.length / audio.sampleRate;
  const realTimeFactor = elapsedSeconds / durationSeconds;

  console.log('Wave duration:', durationSeconds.toFixed(3), 'seconds');
  console.log('Elapsed time:', elapsedSeconds.toFixed(3), 'seconds');
  console.log(
      `RTF = ${elapsedSeconds.toFixed(3)} / ${durationSeconds.toFixed(3)} =`,
      realTimeFactor.toFixed(3));

  const filename = 'test.wav';
  sherpa_onnx.writeWave(filename, {
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  });

  console.log(`Saved to ${filename}`);
}

main().catch((err) => {
  console.error('TTS failed:', err);
  process.exit(1);
});

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to play with kitten-mini-en-v0_1-fp16 with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final kitten = sherpa_onnx.OfflineTtsKittenModelConfig(
    model: './kitten-mini-en-v0_1-fp16/model.fp16.onnx',
    voices: './kitten-mini-en-v0_1-fp16/voices.bin',
    tokens: './kitten-mini-en-v0_1-fp16/tokens.txt',
    dataDir: './kitten-mini-en-v0_1-fp16/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    kitten: kitten,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with kitten-mini-en-v0_1-fp16 with Swift API.

func run() {
  let kitten = sherpaOnnxOfflineTtsKittenModelConfig(
    model: "./kitten-mini-en-v0_1-fp16/model.fp16.onnx",
    voices: "./kitten-mini-en-v0_1-fp16/voices.bin",
    tokens: "./kitten-mini-en-v0_1-fp16/tokens.txt",
    dataDir: "./kitten-mini-en-v0_1-fp16/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(kitten: kitten)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with kitten-mini-en-v0_1-fp16 with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Kitten.Model = "./kitten-mini-en-v0_1-fp16/model.fp16.onnx";
config.Model.Kitten.Voices = "./kitten-mini-en-v0_1-fp16/voices.bin";
config.Model.Kitten.Tokens = "./kitten-mini-en-v0_1-fp16/tokens.txt";
config.Model.Kitten.DataDir = "./kitten-mini-en-v0_1-fp16/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with kitten-mini-en-v0_1-fp16 with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      kitten = OfflineTtsKittenModelConfig(
        model = "./kitten-mini-en-v0_1-fp16/model.fp16.onnx",
        voices = "./kitten-mini-en-v0_1-fp16/voices.bin",
        tokens = "./kitten-mini-en-v0_1-fp16/tokens.txt",
        dataDir = "./kitten-mini-en-v0_1-fp16/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with kitten-mini-en-v0_1-fp16 with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var kitten = new OfflineTtsKittenModelConfig();
    kitten.setModel("./kitten-mini-en-v0_1-fp16/model.fp16.onnx");
    kitten.setVoices("./kitten-mini-en-v0_1-fp16/voices.bin");
    kitten.setTokens("./kitten-mini-en-v0_1-fp16/tokens.txt");
    kitten.setDataDir("./kitten-mini-en-v0_1-fp16/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setKitten(kitten);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with kitten-mini-en-v0_1-fp16 with Pascal API.

program test_kitten;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Kitten.Model := './kitten-mini-en-v0_1-fp16/model.fp16.onnx';
  Config.Model.Kitten.Voices := './kitten-mini-en-v0_1-fp16/voices.bin';
  Config.Model.Kitten.Tokens := './kitten-mini-en-v0_1-fp16/tokens.txt';
  Config.Model.Kitten.DataDir := './kitten-mini-en-v0_1-fp16/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with kitten-mini-en-v0_1-fp16 with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Kitten: sherpa.OfflineTtsKittenModelConfig{
				Model:   "./kitten-mini-en-v0_1-fp16/model.fp16.onnx",
				Voices:  "./kitten-mini-en-v0_1-fp16/voices.bin",
				Tokens:  "./kitten-mini-en-v0_1-fp16/tokens.txt",
				DataDir: "./kitten-mini-en-v0_1-fp16/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.

sample audios for different speakers are listed below:

Speaker 0 - expr-voice-2-m

Speaker 1 - expr-voice-2-f

Speaker 2 - expr-voice-3-m

Speaker 3 - expr-voice-3-f

Speaker 4 - expr-voice-4-m

Speaker 5 - expr-voice-4-f

Speaker 6 - expr-voice-5-m

Speaker 7 - expr-voice-5-f

kitten-nano-en-v0_8-fp32

Info about this model

This model is kitten-tts-nano-0.8 and it is from https://huggingface.co/KittenML/kitten-tts-nano-0.8

It supports only English.

| Number of speakers | Sample rate |
|---|---|
| 8 | 24000 |

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kitten-nano-en-v0_8-fp32.tar.bz2
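
If you prefer to fetch the archive from a script, the following is a minimal Python sketch using only the standard library. The helper names (archive_name, extract_archive, download_and_extract) are hypothetical and not part of sherpa-onnx. It assumes the archive unpacks into a directory named after the archive, which matches the paths used in the API examples for this model.

```python
import tarfile
import urllib.request
from pathlib import Path

MODEL_URL = (
    "https://github.com/k2-fsa/sherpa-onnx/releases/download/"
    "tts-models/kitten-nano-en-v0_8-fp32.tar.bz2"
)
SUFFIX = ".tar.bz2"


def archive_name(url: str) -> str:
    """Return the file name component of a download URL."""
    return url.rsplit("/", 1)[-1]


def extract_archive(archive: Path, dest: Path) -> None:
    """Unpack a bzip2-compressed tar archive into dest."""
    with tarfile.open(archive, "r:bz2") as tar:
        tar.extractall(dest)


def download_and_extract(url: str = MODEL_URL, dest: Path = Path(".")) -> Path:
    """Download the archive, unpack it, then delete the archive."""
    archive = dest / archive_name(url)
    urllib.request.urlretrieve(url, archive)
    extract_archive(archive, dest)
    archive.unlink()
    # Assumption: the archive contains one top-level directory named
    # after the archive itself (without the .tar.bz2 suffix).
    return dest / archive.name[: -len(SUFFIX)]
```

Calling download_and_extract() from the directory where you plan to run the examples should leave a ./kitten-nano-en-v0_8-fp32 directory next to your script.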

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

| ABI | URL | China mirror |
|---|---|---|
| arm64-v8a | Download | Download |
| armeabi-v7a | Download | Download |
| x86_64 | Download | Download |
| x86 | Download | Download |

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

Meaning of speaker suffix

| Suffix | Meaning |
|---|---|
| f | Female |
| m | Male |

Speaker ID to speaker name (sid -> name)

The mapping from speaker ID (sid) to speaker name is given below:

0 -> expr-voice-2-m
1 -> expr-voice-2-f
2 -> expr-voice-3-m
3 -> expr-voice-3-f
4 -> expr-voice-4-m
5 -> expr-voice-4-f
6 -> expr-voice-5-m
7 -> expr-voice-5-f

Speaker name to speaker ID (name -> sid)

The mapping from speaker name to speaker ID (sid) is given below:

expr-voice-2-m -> 0
expr-voice-2-f -> 1
expr-voice-3-m -> 2
expr-voice-3-f -> 3
expr-voice-4-m -> 4
expr-voice-4-f -> 5
expr-voice-5-m -> 6
expr-voice-5-f -> 7
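
The two mappings above can be kept in code so that a speaker can be selected by name rather than by number. The following is a minimal Python sketch; SID_TO_NAME, NAME_TO_SID and speaker_id are hypothetical names used for illustration and are not part of the sherpa-onnx API.

```python
# Speaker table for this model, copied from the mapping listed above.
SID_TO_NAME = {
    0: "expr-voice-2-m",
    1: "expr-voice-2-f",
    2: "expr-voice-3-m",
    3: "expr-voice-3-f",
    4: "expr-voice-4-m",
    5: "expr-voice-4-f",
    6: "expr-voice-5-m",
    7: "expr-voice-5-f",
}

# Invert the table once so lookups by name are a plain dict access.
NAME_TO_SID = {name: sid for sid, name in SID_TO_NAME.items()}


def speaker_id(name: str) -> int:
    """Return the sid for a speaker name, or raise ValueError if unknown."""
    try:
        return NAME_TO_SID[name]
    except KeyError:
        raise ValueError(f"Unknown speaker name: {name!r}") from None
```

The returned integer is the value you would pass as sid in the API examples, for instance speaker_id("expr-voice-4-f") in place of a hard-coded 5.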

Python API

Assume you have installed sherpa-onnx and soundfile (the example below uses soundfile to save the audio) via

pip install sherpa-onnx soundfile

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kitten-nano-en-v0_8-fp32.tar.bz2

You can use the following code to play with kitten-nano-en-v0_8-fp32

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        kitten=sherpa_onnx.OfflineTtsKittenModelConfig(
            model="./kitten-nano-en-v0_8-fp32/model.fp32.onnx",
            voices="./kitten-nano-en-v0_8-fp32/voices.bin",
            tokens="./kitten-nano-en-v0_8-fp32/tokens.txt",
            data_dir="./kitten-nano-en-v0_8-fp32/espeak-ng-data",
        ),
        num_threads=2,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.", sid=0, speed=1.0)

sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
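
The example above depends on the soundfile package to save the result. If you would rather avoid that dependency, the samples can be written as 16-bit PCM with the standard library wave module. This is a minimal sketch: write_wav_int16 is a hypothetical helper, and it assumes the generated samples are mono floats in the range [-1, 1].

```python
import struct
import wave
from typing import Sequence


def write_wav_int16(filename: str, samples: Sequence[float],
                    sample_rate: int) -> None:
    """Write mono float samples to a 16-bit PCM WAV file."""
    # Assumption: samples lie in [-1.0, 1.0]; anything outside is clipped.
    pcm = [int(max(-1.0, min(1.0, s)) * 32767) for s in samples]
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)   # mono
        f.setsampwidth(2)   # 2 bytes per sample = 16 bit
        f.setframerate(sample_rate)
        # WAV stores PCM data in little-endian order, hence "<".
        f.writeframes(struct.pack(f"<{len(pcm)}h", *pcm))
```

With the objects from the example above, the call would be write_wav_int16("test.wav", audio.samples, audio.sample_rate).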

C API

You can use the following code to play with kitten-nano-en-v0_8-fp32 with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.kitten.model = "./kitten-nano-en-v0_8-fp32/model.fp32.onnx";
  config.model.kitten.voices = "./kitten-nano-en-v0_8-fp32/voices.bin";
  config.model.kitten.tokens = "./kitten-nano-en-v0_8-fp32/tokens.txt";
  config.model.kitten.data_dir = "./kitten-nano-en-v0_8-fp32/espeak-ng-data";

  config.model.num_threads = 1;
  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
 -DSHERPA_ONNX_ENABLE_C_API=ON \
 -DCMAKE_BUILD_TYPE=Release \
 -DBUILD_SHARED_LIBS=ON \
 -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
 ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-kitten.c. Then you can compile it with the following command:

gcc \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-kitten \
  /tmp/test-kitten.c \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-kitten

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-kitten.
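
If you launch the compiled binary from a script instead of an interactive shell, the same library path setting can be passed through the child process environment. The sketch below is illustrative only: library_path_env is a hypothetical helper, and it simply mirrors the two export lines above (LD_LIBRARY_PATH on Linux, DYLD_LIBRARY_PATH on macOS).

```python
import os
import platform
from typing import Dict, Optional


def library_path_env(lib_dir: str, system: Optional[str] = None,
                     base_env: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Return a copy of the environment with lib_dir prepended to the
    dynamic library search path variable for the given platform."""
    system = system or platform.system()
    env = dict(os.environ if base_env is None else base_env)
    # macOS reports "Darwin"; everything else is treated like Linux here.
    var = "DYLD_LIBRARY_PATH" if system == "Darwin" else "LD_LIBRARY_PATH"
    existing = env.get(var, "")
    # Avoid a trailing ":" when the variable was previously unset.
    env[var] = lib_dir + (":" + existing if existing else "")
    return env
```

A typical use would be subprocess.run(["/tmp/test-kitten"], cwd="/tmp", env=library_path_env("/tmp/sherpa-onnx/shared/lib")).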

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

C++ API

You can use the following code to play with kitten-nano-en-v0_8-fp32 with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.kitten.model = "./kitten-nano-en-v0_8-fp32/model.fp32.onnx";
  config.model.kitten.voices = "./kitten-nano-en-v0_8-fp32/voices.bin";
  config.model.kitten.tokens = "./kitten-nano-en-v0_8-fp32/tokens.txt";
  config.model.kitten.data_dir = "./kitten-nano-en-v0_8-fp32/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
 -DSHERPA_ONNX_ENABLE_C_API=ON \
 -DCMAKE_BUILD_TYPE=Release \
 -DBUILD_SHARED_LIBS=ON \
 -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
 ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-kitten.cc. Then you can compile it with the following command:

g++ \
  -std=c++17 \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-kitten \
  /tmp/test-kitten.cc \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-kitten

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-kitten.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with kitten-nano-en-v0_8-fp32 with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsKittenModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            kitten: OfflineTtsKittenModelConfig {
                model: Some("./kitten-nano-en-v0_8-fp32/model.fp32.onnx".into()),
                voices: Some("./kitten-nano-en-v0_8-fp32/voices.bin".into()),
                tokens: Some("./kitten-nano-en-v0_8-fp32/tokens.txt".into()),
                data_dir: Some("./kitten-nano-en-v0_8-fp32/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with kitten-nano-en-v0_8-fp32 with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

async function createOfflineTtsAsync() {
  const config = {
    model: {
      kitten: {
        model: './kitten-nano-en-v0_8-fp32/model.fp32.onnx',
        voices: './kitten-nano-en-v0_8-fp32/voices.bin',
        tokens: './kitten-nano-en-v0_8-fp32/tokens.txt',
        dataDir: './kitten-nano-en-v0_8-fp32/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return await sherpa_onnx.OfflineTts.createAsync(config);
}

async function main() {
  const tts = await createOfflineTtsAsync();

  const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';

  console.log('Number of speakers:', tts.numSpeakers);
  console.log('Sample rate:', tts.sampleRate);

  const start = Date.now();
  const generationConfig = new sherpa_onnx.GenerationConfig({
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  });

  const audio = await tts.generateAsync({
    text,
    generationConfig,
    onProgress({samples, progress}) {
      process.stdout.write(
          `\rGenerating... ${(progress * 100).toFixed(1)}%`);
      return true;
    },
  });

  console.log('\nGeneration finished.');

  const stop = Date.now();
  const elapsedSeconds = (stop - start) / 1000;
  const durationSeconds = audio.samples.length / audio.sampleRate;
  const realTimeFactor = elapsedSeconds / durationSeconds;

  console.log('Wave duration:', durationSeconds.toFixed(3), 'seconds');
  console.log('Elapsed time:', elapsedSeconds.toFixed(3), 'seconds');
  console.log(
      `RTF = ${elapsedSeconds.toFixed(3)} / ${durationSeconds.toFixed(3)} =`,
      realTimeFactor.toFixed(3));

  const filename = 'test.wav';
  sherpa_onnx.writeWave(filename, {
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  });

  console.log(`Saved to ${filename}`);
}

main().catch((err) => {
  console.error('TTS failed:', err);
  process.exit(1);
});

Please refer to the Node.js addon API documentation for more details.
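
The Node.js example above reports a real-time factor (RTF): the time spent generating divided by the duration of the generated audio, so a value below 1 means synthesis runs faster than playback. The same arithmetic is sketched below in Python; real_time_factor is a hypothetical helper name, not a sherpa-onnx function.

```python
def real_time_factor(elapsed_seconds: float, num_samples: int,
                     sample_rate: int) -> float:
    """Return elapsed time divided by the duration of the generated audio."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    # Duration of the audio in seconds, e.g. 72000 samples at 24000 Hz = 3 s.
    duration_seconds = num_samples / sample_rate
    return elapsed_seconds / duration_seconds
```

For example, 72000 samples at this model's 24000 Hz sample rate are 3 seconds of audio; if generating them took 1.5 seconds, the RTF is 0.5.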

Dart API

You can use the following code to play with kitten-nano-en-v0_8-fp32 with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final kitten = sherpa_onnx.OfflineTtsKittenModelConfig(
    model: './kitten-nano-en-v0_8-fp32/model.fp32.onnx',
    voices: './kitten-nano-en-v0_8-fp32/voices.bin',
    tokens: './kitten-nano-en-v0_8-fp32/tokens.txt',
    dataDir: './kitten-nano-en-v0_8-fp32/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    kitten: kitten,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with kitten-nano-en-v0_8-fp32 with Swift API.

func run() {
  let kitten = sherpaOnnxOfflineTtsKittenModelConfig(
    model: "./kitten-nano-en-v0_8-fp32/model.fp32.onnx",
    voices: "./kitten-nano-en-v0_8-fp32/voices.bin",
    tokens: "./kitten-nano-en-v0_8-fp32/tokens.txt",
    dataDir: "./kitten-nano-en-v0_8-fp32/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(kitten: kitten)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with kitten-nano-en-v0_8-fp32 with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Kitten.Model = "./kitten-nano-en-v0_8-fp32/model.fp32.onnx";
config.Model.Kitten.Voices = "./kitten-nano-en-v0_8-fp32/voices.bin";
config.Model.Kitten.Tokens = "./kitten-nano-en-v0_8-fp32/tokens.txt";
config.Model.Kitten.DataDir = "./kitten-nano-en-v0_8-fp32/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with kitten-nano-en-v0_8-fp32 with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      kitten = OfflineTtsKittenModelConfig(
        model = "./kitten-nano-en-v0_8-fp32/model.fp32.onnx",
        voices = "./kitten-nano-en-v0_8-fp32/voices.bin",
        tokens = "./kitten-nano-en-v0_8-fp32/tokens.txt",
        dataDir = "./kitten-nano-en-v0_8-fp32/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with kitten-nano-en-v0_8-fp32 with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var kitten = new OfflineTtsKittenModelConfig();
    kitten.setModel("./kitten-nano-en-v0_8-fp32/model.fp32.onnx");
    kitten.setVoices("./kitten-nano-en-v0_8-fp32/voices.bin");
    kitten.setTokens("./kitten-nano-en-v0_8-fp32/tokens.txt");
    kitten.setDataDir("./kitten-nano-en-v0_8-fp32/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setKitten(kitten);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with kitten-nano-en-v0_8-fp32 with Pascal API.

program test_kitten;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Kitten.Model := './kitten-nano-en-v0_8-fp32/model.fp32.onnx';
  Config.Model.Kitten.Voices := './kitten-nano-en-v0_8-fp32/voices.bin';
  Config.Model.Kitten.Tokens := './kitten-nano-en-v0_8-fp32/tokens.txt';
  Config.Model.Kitten.DataDir := './kitten-nano-en-v0_8-fp32/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with kitten-nano-en-v0_8-fp32 with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Kitten: sherpa.OfflineTtsKittenModelConfig{
				Model:   "./kitten-nano-en-v0_8-fp32/model.fp32.onnx",
				Voices:  "./kitten-nano-en-v0_8-fp32/voices.bin",
				Tokens:  "./kitten-nano-en-v0_8-fp32/tokens.txt",
				DataDir: "./kitten-nano-en-v0_8-fp32/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.

sample audios for different speakers are listed below:

Speaker 0 - expr-voice-2-m

Speaker 1 - expr-voice-2-f

Speaker 2 - expr-voice-3-m

Speaker 3 - expr-voice-3-f

Speaker 4 - expr-voice-4-m

Speaker 5 - expr-voice-4-f

Speaker 6 - expr-voice-5-m

Speaker 7 - expr-voice-5-f

kitten-nano-en-v0_8-int8

Info about this model

This model is kitten-tts-nano-0.8 and it is from https://huggingface.co/KittenML/kitten-tts-nano-0.8

It supports only English.

| Number of speakers | Sample rate |
|---|---|
| 8 | 24000 |

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kitten-nano-en-v0_8-int8.tar.bz2

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

| ABI | URL | China mirror |
|---|---|---|
| arm64-v8a | Download | Download |
| armeabi-v7a | Download | Download |
| x86_64 | Download | Download |
| x86 | Download | Download |

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

Meaning of speaker suffix

| Suffix | Meaning |
|---|---|
| f | Female |
| m | Male |

Speaker ID to speaker name (sid -> name)

The mapping from speaker ID (sid) to speaker name is given below:

0 -> expr-voice-2-m
1 -> expr-voice-2-f
2 -> expr-voice-3-m
3 -> expr-voice-3-f
4 -> expr-voice-4-m
5 -> expr-voice-4-f
6 -> expr-voice-5-m
7 -> expr-voice-5-f

Speaker name to speaker ID (name -> sid)

The mapping from speaker name to speaker ID (sid) is given below:

expr-voice-2-m -> 0
expr-voice-2-f -> 1
expr-voice-3-m -> 2
expr-voice-3-f -> 3
expr-voice-4-m -> 4
expr-voice-4-f -> 5
expr-voice-5-m -> 6
expr-voice-5-f -> 7
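
The eight names listed above follow a regular pattern: even IDs are male voices, odd IDs are female voices, and the voice number increases by one every two IDs. The Python sketch below encodes that observation. speaker_name and speaker_gender are hypothetical helpers, and the pattern is only inferred from the table for this model, so it may not hold for other releases.

```python
SUFFIX_MEANING = {"f": "Female", "m": "Male"}


def speaker_name(sid: int) -> str:
    """Derive the speaker name for a sid in the range 0..7."""
    if not 0 <= sid <= 7:
        raise ValueError(f"sid must be in 0..7, got {sid}")
    voice = 2 + sid // 2                    # 0,1 -> 2; 2,3 -> 3; ...
    suffix = "m" if sid % 2 == 0 else "f"   # even -> male, odd -> female
    return f"expr-voice-{voice}-{suffix}"


def speaker_gender(name: str) -> str:
    """Map the trailing suffix of a speaker name to its meaning."""
    return SUFFIX_MEANING[name.rsplit("-", 1)[-1]]
```

For instance, speaker_name(3) yields expr-voice-3-f, and speaker_gender on that name yields Female, which agrees with the tables above.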

Python API

Assume you have installed sherpa-onnx and soundfile (the example below uses soundfile to save the audio) via

pip install sherpa-onnx soundfile

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kitten-nano-en-v0_8-int8.tar.bz2

You can use the following code to play with kitten-nano-en-v0_8-int8

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        kitten=sherpa_onnx.OfflineTtsKittenModelConfig(
            model="./kitten-nano-en-v0_8-int8/model.int8.onnx",
            voices="./kitten-nano-en-v0_8-int8/voices.bin",
            tokens="./kitten-nano-en-v0_8-int8/tokens.txt",
            data_dir="./kitten-nano-en-v0_8-int8/espeak-ng-data",
        ),
        num_threads=2,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.", sid=0, speed=1.0)

sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)

C API

You can use the following code to play with kitten-nano-en-v0_8-int8 with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.kitten.model = "./kitten-nano-en-v0_8-int8/model.int8.onnx";
  config.model.kitten.voices = "./kitten-nano-en-v0_8-int8/voices.bin";
  config.model.kitten.tokens = "./kitten-nano-en-v0_8-int8/tokens.txt";
  config.model.kitten.data_dir = "./kitten-nano-en-v0_8-int8/espeak-ng-data";

  config.model.num_threads = 1;
  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
 -DSHERPA_ONNX_ENABLE_C_API=ON \
 -DCMAKE_BUILD_TYPE=Release \
 -DBUILD_SHARED_LIBS=ON \
 -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
 ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-kitten.c. Then you can compile it with the following command:

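# The source file is listed before the -l flags so that linkers which
# resolve symbols from left to right can find them.
gcc \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-kitten \
  /tmp/test-kitten.c \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-c-api \
  -lonnxruntime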

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-kitten

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-kitten.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

C++ API

You can use the following code to play with kitten-nano-en-v0_8-int8 with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.kitten.model = "./kitten-nano-en-v0_8-int8/model.int8.onnx";
  config.model.kitten.voices = "./kitten-nano-en-v0_8-int8/voices.bin";
  config.model.kitten.tokens = "./kitten-nano-en-v0_8-int8/tokens.txt";
  config.model.kitten.data_dir = "./kitten-nano-en-v0_8-int8/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
 -DSHERPA_ONNX_ENABLE_C_API=ON \
 -DCMAKE_BUILD_TYPE=Release \
 -DBUILD_SHARED_LIBS=ON \
 -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
 ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-kitten.cc. Then you can compile it with the following command:

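# The source file is listed before the -l flags so that linkers which
# resolve symbols from left to right can find them.
g++ \
  -std=c++17 \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-kitten \
  /tmp/test-kitten.cc \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-c-api \
  -lonnxruntime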

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-kitten

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-kitten.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with kitten-nano-en-v0_8-int8 with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsKittenModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            kitten: OfflineTtsKittenModelConfig {
                model: Some("./kitten-nano-en-v0_8-int8/model.int8.onnx".into()),
                voices: Some("./kitten-nano-en-v0_8-int8/voices.bin".into()),
                tokens: Some("./kitten-nano-en-v0_8-int8/tokens.txt".into()),
                data_dir: Some("./kitten-nano-en-v0_8-int8/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with kitten-nano-en-v0_8-int8 with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

async function createOfflineTtsAsync() {
  const config = {
    model: {
      kitten: {
        model: './kitten-nano-en-v0_8-int8/model.int8.onnx',
        voices: './kitten-nano-en-v0_8-int8/voices.bin',
        tokens: './kitten-nano-en-v0_8-int8/tokens.txt',
        dataDir: './kitten-nano-en-v0_8-int8/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return await sherpa_onnx.OfflineTts.createAsync(config);
}

async function main() {
  const tts = await createOfflineTtsAsync();

  const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';

  console.log('Number of speakers:', tts.numSpeakers);
  console.log('Sample rate:', tts.sampleRate);

  const start = Date.now();
  const generationConfig = new sherpa_onnx.GenerationConfig({
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  });

  const audio = await tts.generateAsync({
    text,
    generationConfig,
    onProgress({samples, progress}) {
      process.stdout.write(
          `\rGenerating... ${(progress * 100).toFixed(1)}%`);
      return true;
    },
  });

  console.log('\nGeneration finished.');

  const stop = Date.now();
  const elapsedSeconds = (stop - start) / 1000;
  const durationSeconds = audio.samples.length / audio.sampleRate;
  const realTimeFactor = elapsedSeconds / durationSeconds;

  console.log('Wave duration:', durationSeconds.toFixed(3), 'seconds');
  console.log('Elapsed time:', elapsedSeconds.toFixed(3), 'seconds');
  console.log(
      `RTF = ${elapsedSeconds.toFixed(3)} / ${durationSeconds.toFixed(3)} =`,
      realTimeFactor.toFixed(3));

  const filename = 'test.wav';
  sherpa_onnx.writeWave(filename, {
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  });

  console.log(`Saved to ${filename}`);
}

main().catch((err) => {
  console.error('TTS failed:', err);
  process.exit(1);
});

Please refer to the Node.js addon API documentation for more details.
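
The Node.js example above reports a real-time factor (RTF): the elapsed wall-clock time divided by the duration of the generated audio, where the duration is the number of samples divided by the sample rate. A value below 1 means that generation is faster than playback. The following Python sketch restates that arithmetic on its own; the function name and the input numbers are made up for illustration and are not measurements of this model.

```python
def real_time_factor(elapsed_seconds: float, num_samples: int, sample_rate: int) -> float:
    """RTF = time spent generating / duration of the generated audio."""
    duration_seconds = num_samples / sample_rate
    if duration_seconds <= 0:
        raise ValueError("The generated audio is empty")
    return elapsed_seconds / duration_seconds


# Hypothetical numbers: 48000 samples at 24000 Hz is 2 s of audio.
# Generating it in 0.5 s gives an RTF of 0.25.
print(real_time_factor(0.5, 48000, 24000))  # 0.25
```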

Dart API

You can use the following code to play with kitten-nano-en-v0_8-int8 with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final kitten = sherpa_onnx.OfflineTtsKittenModelConfig(
    model: './kitten-nano-en-v0_8-int8/model.int8.onnx',
    voices: './kitten-nano-en-v0_8-int8/voices.bin',
    tokens: './kitten-nano-en-v0_8-int8/tokens.txt',
    dataDir: './kitten-nano-en-v0_8-int8/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    kitten: kitten,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with kitten-nano-en-v0_8-int8 with Swift API.

func run() {
  let kitten = sherpaOnnxOfflineTtsKittenModelConfig(
    model: "./kitten-nano-en-v0_8-int8/model.int8.onnx",
    voices: "./kitten-nano-en-v0_8-int8/voices.bin",
    tokens: "./kitten-nano-en-v0_8-int8/tokens.txt",
    dataDir: "./kitten-nano-en-v0_8-int8/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(kitten: kitten)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with kitten-nano-en-v0_8-int8 with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Kitten.Model = "./kitten-nano-en-v0_8-int8/model.int8.onnx";
config.Model.Kitten.Voices = "./kitten-nano-en-v0_8-int8/voices.bin";
config.Model.Kitten.Tokens = "./kitten-nano-en-v0_8-int8/tokens.txt";
config.Model.Kitten.DataDir = "./kitten-nano-en-v0_8-int8/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with kitten-nano-en-v0_8-int8 with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      kitten = OfflineTtsKittenModelConfig(
        model = "./kitten-nano-en-v0_8-int8/model.int8.onnx",
        voices = "./kitten-nano-en-v0_8-int8/voices.bin",
        tokens = "./kitten-nano-en-v0_8-int8/tokens.txt",
        dataDir = "./kitten-nano-en-v0_8-int8/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with kitten-nano-en-v0_8-int8 with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var kitten = new OfflineTtsKittenModelConfig();
    kitten.setModel("./kitten-nano-en-v0_8-int8/model.int8.onnx");
    kitten.setVoices("./kitten-nano-en-v0_8-int8/voices.bin");
    kitten.setTokens("./kitten-nano-en-v0_8-int8/tokens.txt");
    kitten.setDataDir("./kitten-nano-en-v0_8-int8/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setKitten(kitten);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with kitten-nano-en-v0_8-int8 with Pascal API.

program test_kitten;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Kitten.Model := './kitten-nano-en-v0_8-int8/model.int8.onnx';
  Config.Model.Kitten.Voices := './kitten-nano-en-v0_8-int8/voices.bin';
  Config.Model.Kitten.Tokens := './kitten-nano-en-v0_8-int8/tokens.txt';
  Config.Model.Kitten.DataDir := './kitten-nano-en-v0_8-int8/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with kitten-nano-en-v0_8-int8 with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Kitten: sherpa.OfflineTtsKittenModelConfig{
				Model:   "./kitten-nano-en-v0_8-int8/model.int8.onnx",
				Voices:  "./kitten-nano-en-v0_8-int8/voices.bin",
				Tokens:  "./kitten-nano-en-v0_8-int8/tokens.txt",
				DataDir: "./kitten-nano-en-v0_8-int8/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.

sample audio for each speaker is listed below:

Speaker 0 - expr-voice-2-m

Speaker 1 - expr-voice-2-f

Speaker 2 - expr-voice-3-m

Speaker 3 - expr-voice-3-f

Speaker 4 - expr-voice-4-m

Speaker 5 - expr-voice-4-f

Speaker 6 - expr-voice-5-m

Speaker 7 - expr-voice-5-f

kitten-micro-en-v0_8

Info about this model

This model is kitten-tts-micro-0.8 and it is from https://huggingface.co/KittenML/kitten-tts-micro-0.8

It supports only English.

Number of speakers  Sample rate
8                   24000

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kitten-micro-en-v0_8.tar.bz2
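
The archive can also be fetched and unpacked from Python. The sketch below is an illustration rather than an official tool: the helper names are hypothetical, and it assumes that the archive unpacks into a directory named after the archive (kitten-micro-en-v0_8), which matches the paths used in the API examples in this section.

```python
import os
import tarfile
import urllib.request

MODEL_URL = (
    "https://github.com/k2-fsa/sherpa-onnx/releases/download/"
    "tts-models/kitten-micro-en-v0_8.tar.bz2"
)


def archive_name(url: str) -> str:
    """Return the file name at the end of the URL."""
    return url.rsplit("/", 1)[-1]


def expected_model_dir(url: str) -> str:
    """Directory name we assume the archive extracts to (archive name minus .tar.bz2)."""
    name = archive_name(url)
    suffix = ".tar.bz2"
    if not name.endswith(suffix):
        raise ValueError(f"Expected a {suffix} archive, got {name!r}")
    return name[: -len(suffix)]


def download_and_extract(url: str, dest: str = ".") -> str:
    """Download the archive into dest, unpack it, and delete the archive."""
    archive = os.path.join(dest, archive_name(url))
    urllib.request.urlretrieve(url, archive)
    with tarfile.open(archive, "r:bz2") as tar:
        tar.extractall(dest)
    os.remove(archive)
    return os.path.join(dest, expected_model_dir(url))


# Nothing is downloaded until download_and_extract(MODEL_URL) is called.
print(expected_model_dir(MODEL_URL))  # kitten-micro-en-v0_8
```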

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI          URL       China mirror
arm64-v8a    Download  Download
armeabi-v7a  Download  Download
x86_64       Download  Download
x86          Download  Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

Meaning of speaker suffix

Suffix  Meaning
f       Female
m       Male
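
As a small illustration of the suffix convention, the following Python sketch reads the last dash-separated part of a speaker name such as expr-voice-2-m and looks it up in the table above. The helper name speaker_gender is hypothetical and not part of sherpa-onnx.

```python
# Suffix table from above.
SUFFIX_MEANING = {"f": "Female", "m": "Male"}


def speaker_gender(name: str) -> str:
    """Return 'Female' or 'Male' based on the suffix after the last dash."""
    suffix = name.rsplit("-", 1)[-1]
    if suffix not in SUFFIX_MEANING:
        raise ValueError(f"Unknown speaker suffix in {name!r}")
    return SUFFIX_MEANING[suffix]


print(speaker_gender("expr-voice-2-m"))  # Male
print(speaker_gender("expr-voice-5-f"))  # Female
```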

speaker ID to speaker name (sid -> name)

The mapping from speaker ID (sid) to speaker name is given below:

0 - 1    0 -> expr-voice-2-m    1 -> expr-voice-2-f
2 - 3    2 -> expr-voice-3-m    3 -> expr-voice-3-f
4 - 5    4 -> expr-voice-4-m    5 -> expr-voice-4-f
6 - 7    6 -> expr-voice-5-m    7 -> expr-voice-5-f

speaker name to speaker ID (name -> sid)

The mapping from speaker name to speaker ID (sid) is given below:

0 - 1    expr-voice-2-m -> 0    expr-voice-2-f -> 1
2 - 3    expr-voice-3-m -> 2    expr-voice-3-f -> 3
4 - 5    expr-voice-4-m -> 4    expr-voice-4-f -> 5
6 - 7    expr-voice-5-m -> 6    expr-voice-5-f -> 7
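
The mapping above follows a regular pattern: voice numbers run from 2 to 5, and within each voice the male speaker comes first. The sketch below derives the sid arithmetically from that pattern. This is an observation about the table rather than a documented guarantee, and the function names are hypothetical.

```python
import re

# Matches names such as "expr-voice-3-f": voice number 2..5, suffix m or f.
_NAME_RE = re.compile(r"^expr-voice-([2-5])-([mf])$")


def name_to_sid(name: str) -> int:
    """sid = 2 * (voice number - 2), plus 1 for a female speaker."""
    match = _NAME_RE.match(name)
    if match is None:
        raise ValueError(f"Unknown speaker name: {name!r}")
    voice = int(match.group(1))
    return 2 * (voice - 2) + (0 if match.group(2) == "m" else 1)


def sid_to_name(sid: int) -> str:
    """Inverse of name_to_sid for sid in 0..7."""
    if not 0 <= sid <= 7:
        raise ValueError(f"sid must be in 0..7, got {sid}")
    voice_index, is_female = divmod(sid, 2)
    return f"expr-voice-{voice_index + 2}-{'f' if is_female else 'm'}"


print(name_to_sid("expr-voice-3-f"))  # 3
print(sid_to_name(6))                 # expr-voice-5-m
```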

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kitten-micro-en-v0_8.tar.bz2

You can use the following code to play with kitten-micro-en-v0_8

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        kitten=sherpa_onnx.OfflineTtsKittenModelConfig(
            model="./kitten-micro-en-v0_8/model.onnx",
            voices="./kitten-micro-en-v0_8/voices.bin",
            tokens="./kitten-micro-en-v0_8/tokens.txt",
            data_dir="./kitten-micro-en-v0_8/espeak-ng-data",
        ),
        num_threads=2,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.", sid=0, speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
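
If soundfile is not available, the generated samples can also be saved with Python's standard library. The sketch below is one possible approach, assuming the samples are mono floats in the range [-1, 1]; the helper write_wave_16bit is hypothetical, and a short synthetic tone stands in for audio.samples and audio.sample_rate so that the example runs without a model.

```python
import math
import os
import struct
import tempfile
import wave


def write_wave_16bit(filename, samples, sample_rate):
    """Write mono float samples in [-1, 1] as a 16-bit PCM WAV file."""
    frames = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))  # clip to the valid range
        frames += struct.pack("<h", int(round(s * 32767)))
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)  # 2 bytes per sample = 16 bits
        f.setframerate(sample_rate)
        f.writeframes(bytes(frames))


# Stand-in for audio.samples / audio.sample_rate: 0.1 s of a 440 Hz tone.
sample_rate = 24000
samples = [0.5 * math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(2400)]

path = os.path.join(tempfile.mkdtemp(), "test.wav")
write_wave_16bit(path, samples, sample_rate)

with wave.open(path, "rb") as f:
    num_frames, frame_rate = f.getnframes(), f.getframerate()
print(num_frames, frame_rate)  # 2400 24000
```

With a real model, the call would be write_wave_16bit("test.wav", audio.samples, audio.sample_rate).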

C API

You can use the following code to play with kitten-micro-en-v0_8 with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.kitten.model = "./kitten-micro-en-v0_8/model.onnx";
  config.model.kitten.voices = "./kitten-micro-en-v0_8/voices.bin";
  config.model.kitten.tokens = "./kitten-micro-en-v0_8/tokens.txt";
  config.model.kitten.data_dir = "./kitten-micro-en-v0_8/espeak-ng-data";

  config.model.num_threads = 1;
  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
 -DSHERPA_ONNX_ENABLE_C_API=ON \
 -DCMAKE_BUILD_TYPE=Release \
 -DBUILD_SHARED_LIBS=ON \
 -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
 ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-kitten.c. Then you can compile it with the following command:

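# The source file is listed before the -l flags so that linkers which
# resolve symbols from left to right can find them.
gcc \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-kitten \
  /tmp/test-kitten.c \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-c-api \
  -lonnxruntime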

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-kitten

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-kitten.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

C++ API

You can use the following code to play with kitten-micro-en-v0_8 with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.kitten.model = "./kitten-micro-en-v0_8/model.onnx";
  config.model.kitten.voices = "./kitten-micro-en-v0_8/voices.bin";
  config.model.kitten.tokens = "./kitten-micro-en-v0_8/tokens.txt";
  config.model.kitten.data_dir = "./kitten-micro-en-v0_8/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
 -DSHERPA_ONNX_ENABLE_C_API=ON \
 -DCMAKE_BUILD_TYPE=Release \
 -DBUILD_SHARED_LIBS=ON \
 -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
 ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-kitten.cc. Then you can compile it with the following command:

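# The source file is listed before the -l flags so that linkers which
# resolve symbols from left to right can find them.
g++ \
  -std=c++17 \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-kitten \
  /tmp/test-kitten.cc \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-c-api \
  -lonnxruntime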

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-kitten

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-kitten.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with kitten-micro-en-v0_8 with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsKittenModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            kitten: OfflineTtsKittenModelConfig {
                model: Some("./kitten-micro-en-v0_8/model.onnx".into()),
                voices: Some("./kitten-micro-en-v0_8/voices.bin".into()),
                tokens: Some("./kitten-micro-en-v0_8/tokens.txt".into()),
                data_dir: Some("./kitten-micro-en-v0_8/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with kitten-micro-en-v0_8 with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

async function createOfflineTtsAsync() {
  const config = {
    model: {
      kitten: {
        model: './kitten-micro-en-v0_8/model.onnx',
        voices: './kitten-micro-en-v0_8/voices.bin',
        tokens: './kitten-micro-en-v0_8/tokens.txt',
        dataDir: './kitten-micro-en-v0_8/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return await sherpa_onnx.OfflineTts.createAsync(config);
}

async function main() {
  const tts = await createOfflineTtsAsync();

  const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';

  console.log('Number of speakers:', tts.numSpeakers);
  console.log('Sample rate:', tts.sampleRate);

  const start = Date.now();
  const generationConfig = new sherpa_onnx.GenerationConfig({
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  });

  const audio = await tts.generateAsync({
    text,
    generationConfig,
    onProgress({samples, progress}) {
      process.stdout.write(
          `\rGenerating... ${(progress * 100).toFixed(1)}%`);
      return true;
    },
  });

  console.log('\nGeneration finished.');

  const stop = Date.now();
  const elapsedSeconds = (stop - start) / 1000;
  const durationSeconds = audio.samples.length / audio.sampleRate;
  const realTimeFactor = elapsedSeconds / durationSeconds;

  console.log('Wave duration:', durationSeconds.toFixed(3), 'seconds');
  console.log('Elapsed time:', elapsedSeconds.toFixed(3), 'seconds');
  console.log(
      `RTF = ${elapsedSeconds.toFixed(3)} / ${durationSeconds.toFixed(3)} =`,
      realTimeFactor.toFixed(3));

  const filename = 'test.wav';
  sherpa_onnx.writeWave(filename, {
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  });

  console.log(`Saved to ${filename}`);
}

main().catch((err) => {
  console.error('TTS failed:', err);
  process.exit(1);
});

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to play with kitten-micro-en-v0_8 with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final kitten = sherpa_onnx.OfflineTtsKittenModelConfig(
    model: './kitten-micro-en-v0_8/model.onnx',
    voices: './kitten-micro-en-v0_8/voices.bin',
    tokens: './kitten-micro-en-v0_8/tokens.txt',
    dataDir: './kitten-micro-en-v0_8/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    kitten: kitten,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with kitten-micro-en-v0_8 with Swift API.

func run() {
  let kitten = sherpaOnnxOfflineTtsKittenModelConfig(
    model: "./kitten-micro-en-v0_8/model.onnx",
    voices: "./kitten-micro-en-v0_8/voices.bin",
    tokens: "./kitten-micro-en-v0_8/tokens.txt",
    dataDir: "./kitten-micro-en-v0_8/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(kitten: kitten)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with kitten-micro-en-v0_8 with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Kitten.Model = "./kitten-micro-en-v0_8/model.onnx";
config.Model.Kitten.Voices = "./kitten-micro-en-v0_8/voices.bin";
config.Model.Kitten.Tokens = "./kitten-micro-en-v0_8/tokens.txt";
config.Model.Kitten.DataDir = "./kitten-micro-en-v0_8/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with kitten-micro-en-v0_8 with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  val config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      kitten = OfflineTtsKittenModelConfig(
        model = "./kitten-micro-en-v0_8/model.onnx",
        voices = "./kitten-micro-en-v0_8/voices.bin",
        tokens = "./kitten-micro-en-v0_8/tokens.txt",
        dataDir = "./kitten-micro-en-v0_8/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with kitten-micro-en-v0_8 with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var kitten = new OfflineTtsKittenModelConfig();
    kitten.setModel("./kitten-micro-en-v0_8/model.onnx");
    kitten.setVoices("./kitten-micro-en-v0_8/voices.bin");
    kitten.setTokens("./kitten-micro-en-v0_8/tokens.txt");
    kitten.setDataDir("./kitten-micro-en-v0_8/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setKitten(kitten);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with kitten-micro-en-v0_8 with Pascal API.

program test_kitten;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Kitten.Model := './kitten-micro-en-v0_8/model.onnx';
  Config.Model.Kitten.Voices := './kitten-micro-en-v0_8/voices.bin';
  Config.Model.Kitten.Tokens := './kitten-micro-en-v0_8/tokens.txt';
  Config.Model.Kitten.DataDir := './kitten-micro-en-v0_8/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with kitten-micro-en-v0_8 with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Kitten: sherpa.OfflineTtsKittenModelConfig{
				Model:   "./kitten-micro-en-v0_8/model.onnx",
				Voices:  "./kitten-micro-en-v0_8/voices.bin",
				Tokens:  "./kitten-micro-en-v0_8/tokens.txt",
				DataDir: "./kitten-micro-en-v0_8/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.

sample audio for each speaker is listed below:

Speaker 0 - expr-voice-2-m

Speaker 1 - expr-voice-2-f

Speaker 2 - expr-voice-3-m

Speaker 3 - expr-voice-3-f

Speaker 4 - expr-voice-4-m

Speaker 5 - expr-voice-4-f

Speaker 6 - expr-voice-5-m

Speaker 7 - expr-voice-5-f

kitten-mini-en-v0_8

Info about this model

This model is kitten-tts-mini-0.8 and it is from https://huggingface.co/KittenML/kitten-tts-mini-0.8

It supports only English.

Number of speakers    Sample rate
8                     24000

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kitten-mini-en-v0_8.tar.bz2
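
If you prefer to script the download, the following is a minimal Python sketch that uses only the standard library. The helper names model_url and download_and_extract are hypothetical and are not part of sherpa-onnx. The URL pattern is copied from the address above, and the sketch assumes the archive unpacks into a directory named after the model, as the paths in the examples below suggest.

```python
import tarfile
import urllib.request
from pathlib import Path

# Release URL prefix copied from the download address above.
BASE_URL = "https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models"


def model_url(name: str) -> str:
    """Build the download URL of a model archive (hypothetical helper)."""
    return f"{BASE_URL}/{name}.tar.bz2"


def download_and_extract(name: str, dest: str = ".") -> Path:
    """Download <name>.tar.bz2 into dest, extract it, and return the model dir.

    Assumption: the archive contains a top-level directory called <name>.
    """
    dest_dir = Path(dest)
    dest_dir.mkdir(parents=True, exist_ok=True)
    archive = dest_dir / f"{name}.tar.bz2"
    urllib.request.urlretrieve(model_url(name), str(archive))
    with tarfile.open(archive, "r:bz2") as tar:
        tar.extractall(dest_dir)
    archive.unlink()  # remove the archive once it has been extracted
    return dest_dir / name


# Usage (requires network access):
#   model_dir = download_and_extract("kitten-mini-en-v0_8")
```

Nothing is downloaded until download_and_extract is called, so the block is safe to paste into a larger script.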

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2:

ABI            URL         China mirror
arm64-v8a      Download    Download
armeabi-v7a    Download    Download
x86_64         Download    Download
x86            Download    Download

If you don’t know what ABI is, you probably need to select arm64-v8a.
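
As a rough aid for picking an ABI, the sketch below maps common CPU architecture strings (for example, the output of uname -m on the device) to the ABI names in the table above. The function name abi_for_machine and the lookup table are illustrative assumptions and are not shipped with sherpa-onnx. Unknown values fall back to arm64-v8a, mirroring the advice above.

```python
# Common machine strings mapped to the Android ABI names used in the table.
_MACHINE_TO_ABI = {
    "aarch64": "arm64-v8a",
    "arm64": "arm64-v8a",
    "armv7l": "armeabi-v7a",
    "x86_64": "x86_64",
    "amd64": "x86_64",
    "i686": "x86",
    "i386": "x86",
}


def abi_for_machine(machine: str, default: str = "arm64-v8a") -> str:
    """Return the ABI name for a machine string (hypothetical helper)."""
    return _MACHINE_TO_ABI.get(machine.strip().lower(), default)


print(abi_for_machine("aarch64"))
```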

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

Meaning of speaker suffix

Suffix    Meaning
f         Female
m         Male

Speaker ID to speaker name (sid -> name)

The mapping from speaker ID (sid) to speaker name is given below:

0 - 1    0 -> expr-voice-2-m    1 -> expr-voice-2-f
2 - 3    2 -> expr-voice-3-m    3 -> expr-voice-3-f
4 - 5    4 -> expr-voice-4-m    5 -> expr-voice-4-f
6 - 7    6 -> expr-voice-5-m    7 -> expr-voice-5-f

Speaker name to speaker ID (name -> sid)

The mapping from speaker name to speaker ID (sid) is given below:

0 - 1    expr-voice-2-m -> 0    expr-voice-2-f -> 1
2 - 3    expr-voice-3-m -> 2    expr-voice-3-f -> 3
4 - 5    expr-voice-4-m -> 4    expr-voice-4-f -> 5
6 - 7    expr-voice-5-m -> 6    expr-voice-5-f -> 7
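
The two mappings above can be kept in code as a single list indexed by sid. The following is a small sketch, not part of sherpa-onnx: the names KITTEN_SPEAKERS, name_to_sid, sid_to_name and gender_of are made up for illustration, and the data is copied from the tables above.

```python
# Speaker names in sid order, copied from the mapping above.
KITTEN_SPEAKERS = [
    "expr-voice-2-m", "expr-voice-2-f",
    "expr-voice-3-m", "expr-voice-3-f",
    "expr-voice-4-m", "expr-voice-4-f",
    "expr-voice-5-m", "expr-voice-5-f",
]

# Meaning of the trailing suffix in a speaker name.
SUFFIX_MEANING = {"f": "Female", "m": "Male"}


def name_to_sid(name: str) -> int:
    """Return the speaker ID for a name; raises ValueError if it is unknown."""
    return KITTEN_SPEAKERS.index(name)


def sid_to_name(sid: int) -> str:
    """Return the speaker name for a speaker ID in the range 0..7."""
    if not 0 <= sid < len(KITTEN_SPEAKERS):
        raise ValueError(f"sid must be in [0, {len(KITTEN_SPEAKERS) - 1}], got {sid}")
    return KITTEN_SPEAKERS[sid]


def gender_of(name: str) -> str:
    """Decode the suffix after the last '-' in a speaker name."""
    return SUFFIX_MEANING[name.rsplit("-", 1)[-1]]


print(name_to_sid("expr-voice-3-f"), sid_to_name(6), gender_of("expr-voice-4-m"))
```

The value returned by name_to_sid is what you would pass as sid in the API examples below.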

Python API

Assume you have installed sherpa-onnx and soundfile (used below to save the audio) via

pip install sherpa-onnx soundfile

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kitten-mini-en-v0_8.tar.bz2

You can use the following code to play with kitten-mini-en-v0_8

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        kitten=sherpa_onnx.OfflineTtsKittenModelConfig(
            model="./kitten-mini-en-v0_8/model.onnx",
            voices="./kitten-mini-en-v0_8/voices.bin",
            tokens="./kitten-mini-en-v0_8/tokens.txt",
            data_dir="./kitten-mini-en-v0_8/espeak-ng-data",
        ),
        num_threads=2,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.", sid=0, speed=1.0)

sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
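
Before constructing the config, it can help to confirm that the extracted model directory contains the entries the example refers to. The sketch below is one way to do that. The helper missing_model_files is hypothetical, and the list of required entries is taken from the paths used in the code above.

```python
from pathlib import Path

# Entries referenced by the kitten config in the example above.
REQUIRED_ENTRIES = ("model.onnx", "voices.bin", "tokens.txt", "espeak-ng-data")


def missing_model_files(model_dir: str) -> list[str]:
    """Return the required entries that do not exist inside model_dir."""
    root = Path(model_dir)
    return [name for name in REQUIRED_ENTRIES if not (root / name).exists()]


missing = missing_model_files("./kitten-mini-en-v0_8")
if missing:
    print("Missing from the model directory:", ", ".join(missing))
else:
    print("All model files found")
```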

C API

You can use the following code to play with kitten-mini-en-v0_8 with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.kitten.model = "./kitten-mini-en-v0_8/model.onnx";
  config.model.kitten.voices = "./kitten-mini-en-v0_8/voices.bin";
  config.model.kitten.tokens = "./kitten-mini-en-v0_8/tokens.txt";
  config.model.kitten.data_dir = "./kitten-mini-en-v0_8/espeak-ng-data";

  config.model.num_threads = 1;
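  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
  if (!tts) {
    // Stop here instead of dereferencing a NULL handle later
    fprintf(stderr, "Failed to create the TTS engine. Please check your config\n");
    return -1;
  }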

  const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
 -DSHERPA_ONNX_ENABLE_C_API=ON \
 -DCMAKE_BUILD_TYPE=Release \
 -DBUILD_SHARED_LIBS=ON \
 -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
 ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-kitten.c. Then you can compile it with the following command:

gcc \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-kitten \
  /tmp/test-kitten.c \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-kitten

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-kitten.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

C++ API

You can use the following code to play with kitten-mini-en-v0_8 with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.kitten.model = "./kitten-mini-en-v0_8/model.onnx";
  config.model.kitten.voices = "./kitten-mini-en-v0_8/voices.bin";
  config.model.kitten.tokens = "./kitten-mini-en-v0_8/tokens.txt";
  config.model.kitten.data_dir = "./kitten-mini-en-v0_8/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
 -DSHERPA_ONNX_ENABLE_C_API=ON \
 -DCMAKE_BUILD_TYPE=Release \
 -DBUILD_SHARED_LIBS=ON \
 -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
 ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-kitten.cc. Then you can compile it with the following command:

g++ \
  -std=c++17 \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-kitten \
  /tmp/test-kitten.cc \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-kitten

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-kitten.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html
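
The ProgressCallback in the C++ example above controls generation through its return value: 1 continues and 0 stops. The pure-Python simulation below illustrates that contract without calling sherpa-onnx. The function fake_generate and the chunk size are made up for illustration and say nothing about how the library splits audio internally.

```python
from typing import Callable, List


def fake_generate(total_samples: int, chunk_size: int,
                  callback: Callable[[List[float], float], int]) -> List[float]:
    """Produce silence chunk by chunk, asking the callback whether to go on."""
    generated: List[float] = []
    while len(generated) < total_samples:
        n = min(chunk_size, total_samples - len(generated))
        chunk = [0.0] * n
        generated.extend(chunk)
        progress = len(generated) / total_samples
        if callback(chunk, progress) == 0:  # 0 means stop generating
            break
    return generated


def stop_at_half(samples: List[float], progress: float) -> int:
    """Return 1 to continue while less than half is done, otherwise 0."""
    return 1 if progress < 0.5 else 0


audio = fake_generate(total_samples=1000, chunk_size=100, callback=stop_at_half)
print(len(audio))  # prints 500: generation stopped once progress reached 0.5
```

A callback that always returns 1 lets the loop run until all samples have been produced.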

Rust API

You can use the following code to play with kitten-mini-en-v0_8 with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsKittenModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            kitten: OfflineTtsKittenModelConfig {
                model: Some("./kitten-mini-en-v0_8/model.onnx".into()),
                voices: Some("./kitten-mini-en-v0_8/voices.bin".into()),
                tokens: Some("./kitten-mini-en-v0_8/tokens.txt".into()),
                data_dir: Some("./kitten-mini-en-v0_8/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with kitten-mini-en-v0_8 with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

async function createOfflineTtsAsync() {
  const config = {
    model: {
      kitten: {
        model: './kitten-mini-en-v0_8/model.onnx',
        voices: './kitten-mini-en-v0_8/voices.bin',
        tokens: './kitten-mini-en-v0_8/tokens.txt',
        dataDir: './kitten-mini-en-v0_8/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return await sherpa_onnx.OfflineTts.createAsync(config);
}

async function main() {
  const tts = await createOfflineTtsAsync();

  const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';

  console.log('Number of speakers:', tts.numSpeakers);
  console.log('Sample rate:', tts.sampleRate);

  const start = Date.now();
  const generationConfig = new sherpa_onnx.GenerationConfig({
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  });

  const audio = await tts.generateAsync({
    text,
    generationConfig,
    onProgress({samples, progress}) {
      process.stdout.write(
          `\rGenerating... ${(progress * 100).toFixed(1)}%`);
      return true;
    },
  });

  console.log('\nGeneration finished.');

  const stop = Date.now();
  const elapsedSeconds = (stop - start) / 1000;
  const durationSeconds = audio.samples.length / audio.sampleRate;
  const realTimeFactor = elapsedSeconds / durationSeconds;

  console.log('Wave duration:', durationSeconds.toFixed(3), 'seconds');
  console.log('Elapsed time:', elapsedSeconds.toFixed(3), 'seconds');
  console.log(
      `RTF = ${elapsedSeconds.toFixed(3)} / ${durationSeconds.toFixed(3)} =`,
      realTimeFactor.toFixed(3));

  const filename = 'test.wav';
  sherpa_onnx.writeWave(filename, {
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  });

  console.log(`Saved to ${filename}`);
}

main().catch((err) => {
  console.error('TTS failed:', err);
  process.exit(1);
});

Please refer to the Node.js addon API documentation for more details.
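
The Node.js example above prints a real-time factor (RTF): the processing time divided by the duration of the generated audio, so a value below 1 means synthesis runs faster than playback. Here is a minimal Python sketch of the same arithmetic. The function name and the numbers in the usage lines are made up for illustration.

```python
def real_time_factor(num_samples: int, sample_rate: int,
                     elapsed_seconds: float) -> float:
    """RTF = elapsed time / audio duration, with duration = samples / rate."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration_seconds = num_samples / sample_rate
    return elapsed_seconds / duration_seconds


# 120000 samples at 24000 Hz are 5 seconds of audio.
rtf = real_time_factor(num_samples=120_000, sample_rate=24_000,
                       elapsed_seconds=1.25)
print(f"RTF = {rtf:.3f}")  # prints RTF = 0.250
```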

Dart API

You can use the following code to play with kitten-mini-en-v0_8 with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final kitten = sherpa_onnx.OfflineTtsKittenModelConfig(
    model: './kitten-mini-en-v0_8/model.onnx',
    voices: './kitten-mini-en-v0_8/voices.bin',
    tokens: './kitten-mini-en-v0_8/tokens.txt',
    dataDir: './kitten-mini-en-v0_8/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    kitten: kitten,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with kitten-mini-en-v0_8 with Swift API.

func run() {
  let kitten = sherpaOnnxOfflineTtsKittenModelConfig(
    model: "./kitten-mini-en-v0_8/model.onnx",
    voices: "./kitten-mini-en-v0_8/voices.bin",
    tokens: "./kitten-mini-en-v0_8/tokens.txt",
    dataDir: "./kitten-mini-en-v0_8/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(kitten: kitten)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with kitten-mini-en-v0_8 with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Kitten.Model = "./kitten-mini-en-v0_8/model.onnx";
config.Model.Kitten.Voices = "./kitten-mini-en-v0_8/voices.bin";
config.Model.Kitten.Tokens = "./kitten-mini-en-v0_8/tokens.txt";
config.Model.Kitten.DataDir = "./kitten-mini-en-v0_8/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with kitten-mini-en-v0_8 with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  val config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      kitten = OfflineTtsKittenModelConfig(
        model = "./kitten-mini-en-v0_8/model.onnx",
        voices = "./kitten-mini-en-v0_8/voices.bin",
        tokens = "./kitten-mini-en-v0_8/tokens.txt",
        dataDir = "./kitten-mini-en-v0_8/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with kitten-mini-en-v0_8 with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var kitten = new OfflineTtsKittenModelConfig();
    kitten.setModel("./kitten-mini-en-v0_8/model.onnx");
    kitten.setVoices("./kitten-mini-en-v0_8/voices.bin");
    kitten.setTokens("./kitten-mini-en-v0_8/tokens.txt");
    kitten.setDataDir("./kitten-mini-en-v0_8/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setKitten(kitten);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with kitten-mini-en-v0_8 with Pascal API.

program test_kitten;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Kitten.Model := './kitten-mini-en-v0_8/model.onnx';
  Config.Model.Kitten.Voices := './kitten-mini-en-v0_8/voices.bin';
  Config.Model.Kitten.Tokens := './kitten-mini-en-v0_8/tokens.txt';
  Config.Model.Kitten.DataDir := './kitten-mini-en-v0_8/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with kitten-mini-en-v0_8 with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Kitten: sherpa.OfflineTtsKittenModelConfig{
				Model:   "./kitten-mini-en-v0_8/model.onnx",
				Voices:  "./kitten-mini-en-v0_8/voices.bin",
				Tokens:  "./kitten-mini-en-v0_8/tokens.txt",
				DataDir: "./kitten-mini-en-v0_8/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.

sample audio for each speaker is listed below:

Speaker 0 - expr-voice-2-m

Speaker 1 - expr-voice-2-f

Speaker 2 - expr-voice-3-m

Speaker 3 - expr-voice-3-f

Speaker 4 - expr-voice-4-m

Speaker 5 - expr-voice-4-f

Speaker 6 - expr-voice-5-m

Speaker 7 - expr-voice-5-f

kokoro-en-v0_19

Info about this model

This model is kokoro v0.19 and it is from https://huggingface.co/hexgrad/kLegacy

It supports only English.

Number of speakers    Sample rate
11                    24000

Meaning of speaker prefix

Prefix    Meaning            sid range    Number of speakers
af        American female    0 - 4        5
am        American male      5 - 6        2
bf        British female     7 - 8        2
bm        British male       9 - 10       2

Speaker ID to speaker name (sid -> name)

The mapping from speaker ID (sid) to speaker name is given below:

0 - 3     0 -> af             1 -> af_bella     2 -> af_nicole     3 -> af_sarah
4 - 7     4 -> af_sky         5 -> am_adam      6 -> am_michael    7 -> bf_emma
8 - 10    8 -> bf_isabella    9 -> bm_george    10 -> bm_lewis

Speaker name to speaker ID (name -> sid)

The mapping from speaker name to speaker ID (sid) is given below:

0 - 3     af -> 0             af_bella -> 1     af_nicole -> 2     af_sarah -> 3
4 - 7     af_sky -> 4         am_adam -> 5      am_michael -> 6    bf_emma -> 7
8 - 10    bf_isabella -> 8    bm_george -> 9    bm_lewis -> 10
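
To select a voice by accent and gender rather than by number, the prefix table and the mappings above can be combined. The sketch below is illustrative only: KOKORO_SPEAKERS, sids_for_prefix and describe are hypothetical names, and the data is copied from the tables above.

```python
# Speaker names in sid order, copied from the mapping above.
KOKORO_SPEAKERS = [
    "af", "af_bella", "af_nicole", "af_sarah", "af_sky",
    "am_adam", "am_michael",
    "bf_emma", "bf_isabella",
    "bm_george", "bm_lewis",
]

# Meaning of the two-letter prefix of a speaker name.
PREFIX_MEANING = {
    "af": "American female",
    "am": "American male",
    "bf": "British female",
    "bm": "British male",
}


def prefix_of(name: str) -> str:
    """Return the part of the name before the first underscore."""
    return name.split("_", 1)[0]


def sids_for_prefix(prefix: str) -> list[int]:
    """Return every sid whose speaker name carries the given prefix."""
    return [sid for sid, name in enumerate(KOKORO_SPEAKERS)
            if prefix_of(name) == prefix]


def describe(sid: int) -> str:
    """Return a readable description such as '7 -> bf_emma (British female)'."""
    if not 0 <= sid < len(KOKORO_SPEAKERS):
        raise ValueError(f"sid must be in [0, {len(KOKORO_SPEAKERS) - 1}], got {sid}")
    name = KOKORO_SPEAKERS[sid]
    return f"{sid} -> {name} ({PREFIX_MEANING[prefix_of(name)]})"


print(sids_for_prefix("bf"))
print(describe(7))
```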

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kokoro-en-v0_19.tar.bz2

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2:

ABI            URL         China mirror
arm64-v8a      Download    Download
armeabi-v7a    Download    Download
x86_64         Download    Download
x86            Download    Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

Python API

Assume you have installed sherpa-onnx and soundfile (used below to save the audio) via

pip install sherpa-onnx soundfile

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kokoro-en-v0_19.tar.bz2

You can use the following code to play with kokoro-en-v0_19

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        kokoro=sherpa_onnx.OfflineTtsKokoroModelConfig(
            model="kokoro-en-v0_19/model.onnx",
            voices="kokoro-en-v0_19/voices.bin",
            tokens="kokoro-en-v0_19/tokens.txt",
            data_dir="kokoro-en-v0_19/espeak-ng-data",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
                     sid=0,
                     speed=1.0)

sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)

C API

You can use the following code to play with kokoro-en-v0_19 with C API.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));
  config.model.kokoro.model = "kokoro-en-v0_19/model.onnx";
  config.model.kokoro.voices = "kokoro-en-v0_19/voices.bin";
  config.model.kokoro.tokens = "kokoro-en-v0_19/tokens.txt";
  config.model.kokoro.data_dir = "kokoro-en-v0_19/espeak-ng-data";

  config.model.num_threads = 1;

  // If you don't want to see debug messages, please set it to 0
  config.model.debug = 0;

  const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

#if 0
  // If you don't want to use a callback, then please enable this branch
  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
#else
  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg,
                                             ProgressCallback, NULL);
#endif

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
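}

To compile and run the example above, first build sherpa-onnx with shared libraries and the C API enabled:

cd /tmp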
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
 -DSHERPA_ONNX_ENABLE_C_API=ON \
 -DCMAKE_BUILD_TYPE=Release \
 -DBUILD_SHARED_LIBS=ON \
 -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
 ..

make
make install

You can find the required header files and library files in /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-kokoro.c. Then you can compile it with the following command:

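# The source file is listed before the libraries so that linkers which
# resolve symbols in command-line order can find them.
gcc \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-kokoro \
  /tmp/test-kokoro.c \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-c-api \
  -lonnxruntime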

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-kokoro

If the program fails to start because the shared libraries cannot be found, run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-kokoro.
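
If you launch the compiled binary from a Python script, the same library search path can be set per process instead of exporting it in the shell. The sketch below is one way to do that; `runtime_env` is a hypothetical helper and only handles the two variables shown above.

```python
import os


def runtime_env(lib_dir: str, system: str, base_env=None) -> dict:
    """Return a copy of the environment with lib_dir prepended to the
    shared-library search path for the given platform.system() value."""
    var = {"Linux": "LD_LIBRARY_PATH", "Darwin": "DYLD_LIBRARY_PATH"}.get(system)
    env = dict(os.environ if base_env is None else base_env)
    if var is None:
        # Unknown platform: leave the environment unchanged.
        return env
    old = env.get(var, "")
    env[var] = lib_dir + (":" + old if old else "")
    return env
```

A possible use is `subprocess.run(["/tmp/test-kokoro"], cwd="/tmp", env=runtime_env("/tmp/sherpa-onnx/shared/lib", platform.system()), check=True)`, which needs `import subprocess` and `import platform`.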

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

C++ API

You can use the following code to play with kokoro-en-v0_19 with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;

  config.model.kokoro.model = "kokoro-en-v0_19/model.onnx";
  config.model.kokoro.voices = "kokoro-en-v0_19/voices.bin";
  config.model.kokoro.tokens = "kokoro-en-v0_19/tokens.txt";
  config.model.kokoro.data_dir = "kokoro-en-v0_19/espeak-ng-data";

  config.model.num_threads = 1;

  // If you don't want to see debug messages, please set it to 0
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
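}

To compile and run the example above, first build sherpa-onnx with shared libraries and the C API enabled:

cd /tmp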
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
 -DSHERPA_ONNX_ENABLE_C_API=ON \
 -DCMAKE_BUILD_TYPE=Release \
 -DBUILD_SHARED_LIBS=ON \
 -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
 ..

make
make install

You can find the required header files and library files in /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-kokoro.cc. Then you can compile it with the following command:

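# The source file is listed before the libraries so that linkers which
# resolve symbols in command-line order can find them.
g++ \
  -std=c++17 \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-kokoro \
  /tmp/test-kokoro.cc \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-cxx-api \
  -lsherpa-onnx-c-api \
  -lonnxruntime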

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-kokoro

If the program fails to start because the shared libraries cannot be found, run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-kokoro.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html
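
The C and C++ examples above pass a ProgressCallback that returns 1 to continue and 0 to stop. The following sketch only mimics that contract in plain Python; it does not call sherpa-onnx. It assumes, for illustration, that the callback is invoked once per generated chunk, and `generate_in_chunks` and `stop_after_half` are hypothetical names.

```python
from typing import Callable, List, Optional, Sequence

Callback = Callable[[Sequence[float], float], int]


def generate_in_chunks(
    chunks: List[List[float]], callback: Optional[Callback] = None
) -> List[float]:
    """Stand-in for a TTS engine: emit chunks one by one and stop early
    when the callback returns 0."""
    out: List[float] = []
    total = len(chunks)
    for i, chunk in enumerate(chunks, start=1):
        out.extend(chunk)
        # progress is the fraction of chunks produced so far, in (0, 1].
        if callback is not None and callback(chunk, i / total) == 0:
            break
    return out


def stop_after_half(samples: Sequence[float], progress: float) -> int:
    """Return 1 to continue while less than half is done, otherwise 0 to stop."""
    return 1 if progress < 0.5 else 0
```

With four one-sample chunks, `generate_in_chunks(chunks, stop_after_half)` keeps the first two chunks and stops, while passing no callback returns all four.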

Rust API

You can use the following code to play with kokoro-en-v0_19 with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsKokoroModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            kokoro: OfflineTtsKokoroModelConfig {
                model: Some("kokoro-en-v0_19/model.onnx".into()),
                voices: Some("kokoro-en-v0_19/voices.bin".into()),
                tokens: Some("kokoro-en-v0_19/tokens.txt".into()),
                data_dir: Some("kokoro-en-v0_19/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with kokoro-en-v0_19 with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      kokoro: {
        model: 'kokoro-en-v0_19/model.onnx',
        voices: 'kokoro-en-v0_19/voices.bin',
        tokens: 'kokoro-en-v0_19/tokens.txt',
        dataDir: 'kokoro-en-v0_19/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.
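
The Node.js example above reports a real-time factor (RTF): the time spent generating divided by the duration of the generated audio, so values below 1 mean generation is faster than playback. The same arithmetic, as a small Python sketch (the function name is hypothetical):

```python
def real_time_factor(elapsed_seconds: float, num_samples: int, sample_rate: int) -> float:
    """RTF = elapsed time / audio duration, where duration = num_samples / sample_rate."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate
    return elapsed_seconds / duration


if __name__ == "__main__":
    # 48000 samples at 24000 Hz is 2 seconds of audio; 1 second of compute gives 0.5.
    print(real_time_factor(1.0, 48000, 24000))
```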

Dart API

You can use the following code to play with kokoro-en-v0_19 with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final kokoro = sherpa_onnx.OfflineTtsKokoroModelConfig(
    model: 'kokoro-en-v0_19/model.onnx',
    voices: 'kokoro-en-v0_19/voices.bin',
    tokens: 'kokoro-en-v0_19/tokens.txt',
    dataDir: 'kokoro-en-v0_19/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    kokoro: kokoro,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with kokoro-en-v0_19 with Swift API.

func run() {
  let kokoro = sherpaOnnxOfflineTtsKokoroModelConfig(
    model: "kokoro-en-v0_19/model.onnx",
    voices: "kokoro-en-v0_19/voices.bin",
    tokens: "kokoro-en-v0_19/tokens.txt",
    dataDir: "kokoro-en-v0_19/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(kokoro: kokoro)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with kokoro-en-v0_19 with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Kokoro.Model = "kokoro-en-v0_19/model.onnx";
config.Model.Kokoro.Voices = "kokoro-en-v0_19/voices.bin";
config.Model.Kokoro.Tokens = "kokoro-en-v0_19/tokens.txt";
config.Model.Kokoro.DataDir = "kokoro-en-v0_19/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with kokoro-en-v0_19 with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      kokoro = OfflineTtsKokoroModelConfig(
        model = "kokoro-en-v0_19/model.onnx",
        voices = "kokoro-en-v0_19/voices.bin",
        tokens = "kokoro-en-v0_19/tokens.txt",
        dataDir = "kokoro-en-v0_19/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with kokoro-en-v0_19 with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var kokoro = new OfflineTtsKokoroModelConfig();
    kokoro.setModel("kokoro-en-v0_19/model.onnx");
    kokoro.setVoices("kokoro-en-v0_19/voices.bin");
    kokoro.setTokens("kokoro-en-v0_19/tokens.txt");
    kokoro.setDataDir("kokoro-en-v0_19/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setKokoro(kokoro);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with kokoro-en-v0_19 with Pascal API.

program test_kokoro;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Kokoro.Model := 'kokoro-en-v0_19/model.onnx';
  Config.Model.Kokoro.Voices := 'kokoro-en-v0_19/voices.bin';
  Config.Model.Kokoro.Tokens := 'kokoro-en-v0_19/tokens.txt';
  Config.Model.Kokoro.DataDir := 'kokoro-en-v0_19/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with kokoro-en-v0_19 with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Kokoro: sherpa.OfflineTtsKokoroModelConfig{
				Model:  "kokoro-en-v0_19/model.onnx",
				Voices: "kokoro-en-v0_19/voices.bin",
				Tokens: "kokoro-en-v0_19/tokens.txt",
				DataDir: "kokoro-en-v0_19/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.

audio samples for the different speakers are listed below:

Speaker 0 - af

Speaker 1 - af_bella

Speaker 2 - af_nicole

Speaker 3 - af_sarah

Speaker 4 - af_sky

Speaker 5 - am_adam

Speaker 6 - am_michael

Speaker 7 - bf_emma

Speaker 8 - bf_isabella

Speaker 9 - bm_george

Speaker 10 - bm_lewis
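
All the API examples select a voice through an integer speaker ID (`sid`). If you prefer to refer to voices by name, a lookup table built from the list above is enough. This is a small sketch; the names `KOKORO_EN_V0_19_SPEAKERS` and `sid_for` are hypothetical.

```python
# Speaker names in sid order, copied from the list above.
KOKORO_EN_V0_19_SPEAKERS = [
    "af",
    "af_bella",
    "af_nicole",
    "af_sarah",
    "af_sky",
    "am_adam",
    "am_michael",
    "bf_emma",
    "bf_isabella",
    "bm_george",
    "bm_lewis",
]

SPEAKER_TO_SID = {name: sid for sid, name in enumerate(KOKORO_EN_V0_19_SPEAKERS)}


def sid_for(name: str) -> int:
    """Return the speaker ID for a speaker name, or raise ValueError if unknown."""
    try:
        return SPEAKER_TO_SID[name]
    except KeyError:
        raise ValueError(f"Unknown speaker: {name!r}") from None
```

For example, `sid_for("bf_emma")` gives the value to pass as `sid` in any of the examples above.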

supertonic-3-en

Info about this model

This model is supertonic 3 from https://huggingface.co/Supertone/supertonic-3

It supports 31 languages: en, ko, ja, ar, bg, cs, da, de, el, es, et, fi, fr, hi, hr, hu, id, it, lt, lv, nl, pl, pt, ro, ru, sk, sl, sv, tr, uk, vi.

This page shows samples for English (en).

| Number of speakers | Sample rate |
|---|---|
| 10 | 24000 |

Speaker IDs

sid: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9
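
Several of the API examples for this model pass the language as a small JSON string such as `{"lang": "en"}` in the generation config. The sketch below validates a speaker ID and a language code against the lists above and builds that string. The helper names are hypothetical, and the check is done on the caller's side only; it does not reflect how sherpa-onnx itself validates input.

```python
import json

# The 31 language codes listed above.
SUPPORTED_LANGS = (
    "en ko ja ar bg cs da de el es et fi fr hi hr hu "
    "id it lt lv nl pl pt ro ru sk sl sv tr uk vi"
).split()

NUM_SPEAKERS = 10  # speaker IDs 0 to 9


def check_sid(sid: int) -> int:
    """Return sid unchanged if it is a valid speaker ID, else raise ValueError."""
    if not 0 <= sid < NUM_SPEAKERS:
        raise ValueError(f"sid must be in [0, {NUM_SPEAKERS - 1}], got {sid}")
    return sid


def make_extra(lang: str) -> str:
    """Return the JSON string carrying the language code."""
    if lang not in SUPPORTED_LANGS:
        raise ValueError(f"Unsupported language: {lang!r}")
    return json.dumps({"lang": lang})
```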

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-supertonic-3-tts-int8-2026-05-11.tar.bz2
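
After extracting the archive, it can be useful to confirm that every file referenced by the API examples on this page is present before creating the TTS object. This is a minimal sketch; `missing_files` is a hypothetical helper, and the file list is taken from the paths used in those examples rather than from a listing of the archive.

```python
from pathlib import Path

# File names used by the supertonic config in the examples on this page.
REQUIRED_FILES = [
    "duration_predictor.int8.onnx",
    "text_encoder.int8.onnx",
    "vector_estimator.int8.onnx",
    "vocoder.int8.onnx",
    "tts.json",
    "unicode_indexer.bin",
    "voice.bin",
]


def missing_files(model_dir: str) -> list:
    """Return the required file names that do not exist inside model_dir."""
    root = Path(model_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]
```

An empty list from `missing_files("sherpa-onnx-supertonic-3-tts-int8-2026-05-11")` means all seven files were found.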

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

| ABI | URL | China mirror |
|---|---|---|
| arm64-v8a | Download | Download |
| armeabi-v7a | Download | Download |
| x86_64 | Download | Download |
| x86 | Download | Download |

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-supertonic-3-tts-int8-2026-05-11.tar.bz2

You can use the following code to play with supertonic-3

import sherpa_onnx
import soundfile as sf

tts_config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        supertonic=sherpa_onnx.OfflineTtsSupertonicModelConfig(
            duration_predictor="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
            text_encoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
            vector_estimator="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
            vocoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
            tts_json="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
            unicode_indexer="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
            voice_style="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
        ),
        debug=False,
        num_threads=2,
        provider="cpu",
    ),
)

tts = sherpa_onnx.OfflineTts(tts_config)

gen_config = sherpa_onnx.GenerationConfig()
gen_config.sid = 0
gen_config.num_steps = 8
gen_config.speed = 1.0
gen_config.extra["lang"] = "en"

audio = tts.generate("How are you doing today? This is a text-to-speech engine using next generation Kaldi.", gen_config)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
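
If you would rather not depend on soundfile, the samples can also be saved with Python's standard `wave` module. This is a sketch under the assumption that the samples are floats in the range [-1, 1]; values outside that range are clipped. `write_wav_int16` is a hypothetical helper, not part of sherpa-onnx.

```python
import struct
import wave


def write_wav_int16(path: str, samples, sample_rate: int) -> None:
    """Write mono float samples in [-1, 1] as a 16-bit PCM WAV file."""
    clipped = [max(-1.0, min(1.0, float(s))) for s in samples]
    ints = [int(round(s * 32767)) for s in clipped]
    pcm = struct.pack("<%dh" % len(ints), *ints)
    with wave.open(path, "wb") as f:
        f.setnchannels(1)        # mono
        f.setsampwidth(2)        # 2 bytes per sample = 16-bit
        f.setframerate(sample_rate)
        f.writeframes(pcm)
```

With the example above, this would be called as `write_wav_int16("test.wav", audio.samples, audio.sample_rate)`.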

C API

You can use the following code to play with supertonic-3 with C API.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));
  config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
  config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
  config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
  config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
  config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
  config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
  config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";

  config.model.num_threads = 2;

  // If you don't want to see debug messages, please set it to 0
  config.model.debug = 1;

  const char *text = "How are you doing today? This is a text-to-speech engine using next generation Kaldi.";

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.num_steps = 8;
  gen_cfg.speed = 1.0;
  gen_cfg.extra = "{\"lang\": \"en\"}";

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg,
                                             ProgressCallback, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
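}

To compile and run the example above, first build sherpa-onnx with shared libraries and the C API enabled:

cd /tmp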
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
 -DSHERPA_ONNX_ENABLE_C_API=ON \
 -DCMAKE_BUILD_TYPE=Release \
 -DBUILD_SHARED_LIBS=ON \
 -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
 ..

make
make install

You can find the required header files and library files in /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-supertonic.c. Then you can compile it with the following command:

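# The source file is listed before the libraries so that linkers which
# resolve symbols in command-line order can find them.
gcc \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-supertonic \
  /tmp/test-supertonic.c \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-c-api \
  -lonnxruntime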

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic

If the program fails to start because the shared libraries cannot be found, run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-supertonic.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

C++ API

You can use the following code to play with supertonic-3 with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;

  config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
  config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
  config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
  config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
  config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
  config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
  config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";

  config.model.num_threads = 2;

  // If you don't want to see debug messages, please set it to 0
  config.model.debug = 1;

  std::string filename = "./test.wav";
  std::string text = "How are you doing today? This is a text-to-speech engine using next generation Kaldi.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.num_steps = 8;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed
  gen_cfg.extra = R"({"lang": "en"})";

  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
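}

To compile and run the example above, first build sherpa-onnx with shared libraries and the C API enabled:

cd /tmp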
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
 -DSHERPA_ONNX_ENABLE_C_API=ON \
 -DCMAKE_BUILD_TYPE=Release \
 -DBUILD_SHARED_LIBS=ON \
 -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
 ..

make
make install

You can find the required header files and library files in /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-supertonic.cc. Then you can compile it with the following command:

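# The source file is listed before the libraries so that linkers which
# resolve symbols in command-line order can find them.
g++ \
  -std=c++17 \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-supertonic \
  /tmp/test-supertonic.cc \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-cxx-api \
  -lsherpa-onnx-c-api \
  -lonnxruntime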

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic

If the program fails to start because the shared libraries cannot be found, run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-supertonic.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with supertonic-3 with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsSupertonicModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            supertonic: OfflineTtsSupertonicModelConfig {
                duration_predictor: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx".into()),
                text_encoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx".into()),
                vector_estimator: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx".into()),
                vocoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx".into()),
                tts_json: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json".into()),
                unicode_indexer: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin".into()),
                voice_style: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "How are you doing today? This is a text-to-speech engine using next generation Kaldi.";

    let gen_config = GenerationConfig {
        sid: 0,
        num_steps: 8,
        speed: 1.0,
        extra: Some(r#"{"lang": "en"}"#.into()),
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with supertonic-3 with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      supertonic: {
        durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
        textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
        vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
        vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
        ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
        unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
        voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
      },
      debug: true,
      numThreads: 2,
      provider: 'cpu',
    },
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'How are you doing today? This is a text-to-speech engine using next generation Kaldi.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  numSteps: 8,
  speed: 1.0,
  extra: {lang: 'en'},
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to play with supertonic-3 with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final supertonic = sherpa_onnx.OfflineTtsSupertonicModelConfig(
    durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
    textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
    vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
    vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
    ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
    unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
    voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    supertonic: supertonic,
    numThreads: 2,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    numSteps: 8,
    speed: 1.0,
    extra: {'lang': 'en'},
  );
  final audio = tts.generateWithConfig(text: 'How are you doing today? This is a text-to-speech engine using next generation Kaldi.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.
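
Every example in this section passes the same seven files from the extracted model directory. Checking that they exist before constructing the TTS object can turn a confusing load failure into a clear message. The following Python sketch shows one way to do that. The helper names `supertonic_paths` and `missing_files` are hypothetical; only the file names are taken from the examples above.

```python
from pathlib import Path

# File names copied from the examples above; the keys are illustrative labels.
SUPERTONIC_FILES = {
    "duration_predictor": "duration_predictor.int8.onnx",
    "text_encoder": "text_encoder.int8.onnx",
    "vector_estimator": "vector_estimator.int8.onnx",
    "vocoder": "vocoder.int8.onnx",
    "tts_json": "tts.json",
    "unicode_indexer": "unicode_indexer.bin",
    "voice_style": "voice.bin",
}


def supertonic_paths(model_dir):
    """Map each label to its full path inside model_dir."""
    root = Path(model_dir)
    return {key: root / name for key, name in SUPERTONIC_FILES.items()}


def missing_files(paths):
    """Return the labels whose files do not exist on disk, sorted."""
    return sorted(key for key, path in paths.items() if not path.is_file())


paths = supertonic_paths("sherpa-onnx-supertonic-3-tts-int8-2026-05-11")
missing = missing_files(paths)
if missing:
    print("Missing model files:", ", ".join(missing))
else:
    print("All model files found")
```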

Swift API

You can use the following code to play with supertonic-3 with Swift API.

func run() {
  let supertonic = sherpaOnnxOfflineTtsSupertonicModelConfig(
    durationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
    textEncoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
    vectorEstimator: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
    vocoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
    ttsJson: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
    unicodeIndexer: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
    voiceStyle: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(supertonic: supertonic)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "How are you doing today? This is a text-to-speech engine using next generation Kaldi."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.numSteps = 8
  genConfig.speed = 1.0
  genConfig.extra = ["lang": "en"]

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with supertonic-3 with C# API.

using System;
using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Supertonic.DurationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.Model.Supertonic.TextEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.Model.Supertonic.VectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.Model.Supertonic.Vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.Model.Supertonic.TtsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.Model.Supertonic.UnicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.Model.Supertonic.VoiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";

config.Model.NumThreads = 2;
config.Model.Debug = 1;
config.Model.Provider = "cpu";

var tts = new OfflineTts(config);
var text = "How are you doing today? This is a text-to-speech engine using next generation Kaldi.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.NumSteps = 8;
genConfig.Speed = 1.0f;
genConfig.Extra = "{\"lang\": \"en\"}";

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.
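
Several examples in this section (C#, Java, Pascal, Go) pass the extra generation options as a JSON string with hand-escaped quotes. Building that string with a JSON serializer avoids escaping mistakes. Here is a minimal Python sketch of the idea. The helper `build_extra` is hypothetical, and `lang` is the only option key that appears in the examples.

```python
import json


def build_extra(**options):
    """Serialize generation options to a JSON object string."""
    return json.dumps(options, sort_keys=True)


extra = build_extra(lang="en")
print(extra)

# Round-trip to confirm the string is valid JSON.
assert json.loads(extra) == {"lang": "en"}
```

Most languages have a standard equivalent, such as `System.Text.Json` in C# or `encoding/json` in Go.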

Kotlin API

You can use the following code to play with supertonic-3 with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  val config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      supertonic = OfflineTtsSupertonicModelConfig(
        durationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
        textEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
        vectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
        vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
        ttsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
        unicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
        voiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
      ),
      numThreads = 2,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    numSteps = 8,
    speed = 1.0f,
    extra = mapOf("lang" to "en"),
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "How are you doing today? This is a text-to-speech engine using next generation Kaldi.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.
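
The callback in the example above receives generated samples and returns 1 to continue or 0 to stop. The Python sketch below illustrates that contract with a simulated generator. `fake_generate` is a stand-in written for this illustration, not a sherpa-onnx function, and the chunk contents are made up.

```python
def make_limited_callback(max_samples):
    """Build a callback that asks the generator to stop after max_samples."""
    state = {"received": 0}

    def callback(samples):
        state["received"] += len(samples)
        # 1 means continue, 0 means stop (same convention as above).
        return 1 if state["received"] < max_samples else 0

    return callback, state


def fake_generate(chunks, callback):
    """Stand-in for a TTS engine: feed chunks until the callback returns 0."""
    delivered = []
    for chunk in chunks:
        delivered.extend(chunk)
        if callback(chunk) == 0:
            break
    return delivered


callback, state = make_limited_callback(max_samples=5)
chunks = [[0.0, 0.1, 0.2], [0.3, 0.4, 0.5], [0.6, 0.7, 0.8]]
audio = fake_generate(chunks, callback)
print(len(audio), "samples delivered before stopping")
```

Here the second chunk pushes the running total past the limit, so the callback returns 0 and the third chunk is never produced.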

Java API

You can use the following code to play with supertonic-3 with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var supertonic = new OfflineTtsSupertonicModelConfig();
    supertonic.setDurationPredictor("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx");
    supertonic.setTextEncoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx");
    supertonic.setVectorEstimator("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx");
    supertonic.setVocoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx");
    supertonic.setTtsJson("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json");
    supertonic.setUnicodeIndexer("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin");
    supertonic.setVoiceStyle("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setSupertonic(supertonic);
    modelConfig.setNumThreads(2);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);

    var tts = new OfflineTts(config);
    var text = "How are you doing today? This is a text-to-speech engine using next generation Kaldi.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setNumSteps(8);
    genConfig.setSpeed(1.0f);
    genConfig.setExtra("{\"lang\": \"en\"}");

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with supertonic-3 with Pascal API.

program test_supertonic;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Supertonic.DurationPredictor := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx';
  Config.Model.Supertonic.TextEncoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx';
  Config.Model.Supertonic.VectorEstimator := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx';
  Config.Model.Supertonic.Vocoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx';
  Config.Model.Supertonic.TtsJson := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json';
  Config.Model.Supertonic.UnicodeIndexer := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin';
  Config.Model.Supertonic.VoiceStyle := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin';

  Config.Model.NumThreads := 2;
  Config.Model.Debug := True;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.NumSteps := 8;
  GenConfig.Speed := 1.0;
  GenConfig.Extra := '{"lang": "en"}';

  Audio := Tts.GenerateWithConfig('How are you doing today? This is a text-to-speech engine using next generation Kaldi.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with supertonic-3 with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Supertonic: sherpa.OfflineTtsSupertonicModelConfig{
				DurationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
				TextEncoder:       "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
				VectorEstimator:   "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
				Vocoder:           "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
				TtsJson:           "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
				UnicodeIndexer:    "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
				VoiceStyle:        "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
			},
			NumThreads: 2,
			Debug:      true,
		},
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "How are you doing today? This is a text-to-speech engine using next generation Kaldi."

	genConfig := sherpa.GenerationConfig{
		Sid:      0,
		NumSteps: 8,
		Speed:    1.0,
		Extra:    `{"lang": "en"}`,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

Sample audio clips for different speakers are listed below:

Speaker 0

0. Hello world.
1. How are you today?
2. The sky is blue.
3. I love machine learning.
4. Python is awesome.
5. Good morning everyone.
6. Artificial intelligence is growing.
7. Speech synthesis is fascinating.
8. Neural networks are powerful.
9. Text to speech converts text to audio.
10. The quick brown fox jumps over the lazy dog.
11. Machine learning enables computers to learn from data.
12. Natural language processing helps machines understand text.
13. Deep learning has revolutionized artificial intelligence.
14. Speech synthesis technology has advanced significantly.
15. Neural voice cloning can replicate speaking styles.
16. Text normalization is important for proper pronunciation.
17. Voice assistants help us interact with technology naturally.
18. Modern TTS systems use deep learning for high-quality speech.
19. Human computer interaction has become more intuitive.

Speaker 1

0. Hello world.
1. How are you today?
2. The sky is blue.
3. I love machine learning.
4. Python is awesome.
5. Good morning everyone.
6. Artificial intelligence is growing.
7. Speech synthesis is fascinating.
8. Neural networks are powerful.
9. Text to speech converts text to audio.
10. The quick brown fox jumps over the lazy dog.
11. Machine learning enables computers to learn from data.
12. Natural language processing helps machines understand text.
13. Deep learning has revolutionized artificial intelligence.
14. Speech synthesis technology has advanced significantly.
15. Neural voice cloning can replicate speaking styles.
16. Text normalization is important for proper pronunciation.
17. Voice assistants help us interact with technology naturally.
18. Modern TTS systems use deep learning for high-quality speech.
19. Human computer interaction has become more intuitive.

Speaker 2

0. Hello world.
1. How are you today?
2. The sky is blue.
3. I love machine learning.
4. Python is awesome.
5. Good morning everyone.
6. Artificial intelligence is growing.
7. Speech synthesis is fascinating.
8. Neural networks are powerful.
9. Text to speech converts text to audio.
10. The quick brown fox jumps over the lazy dog.
11. Machine learning enables computers to learn from data.
12. Natural language processing helps machines understand text.
13. Deep learning has revolutionized artificial intelligence.
14. Speech synthesis technology has advanced significantly.
15. Neural voice cloning can replicate speaking styles.
16. Text normalization is important for proper pronunciation.
17. Voice assistants help us interact with technology naturally.
18. Modern TTS systems use deep learning for high-quality speech.
19. Human computer interaction has become more intuitive.

Speaker 3

0. Hello world.
1. How are you today?
2. The sky is blue.
3. I love machine learning.
4. Python is awesome.
5. Good morning everyone.
6. Artificial intelligence is growing.
7. Speech synthesis is fascinating.
8. Neural networks are powerful.
9. Text to speech converts text to audio.
10. The quick brown fox jumps over the lazy dog.
11. Machine learning enables computers to learn from data.
12. Natural language processing helps machines understand text.
13. Deep learning has revolutionized artificial intelligence.
14. Speech synthesis technology has advanced significantly.
15. Neural voice cloning can replicate speaking styles.
16. Text normalization is important for proper pronunciation.
17. Voice assistants help us interact with technology naturally.
18. Modern TTS systems use deep learning for high-quality speech.
19. Human computer interaction has become more intuitive.

Speaker 4

0. Hello world.
1. How are you today?
2. The sky is blue.
3. I love machine learning.
4. Python is awesome.
5. Good morning everyone.
6. Artificial intelligence is growing.
7. Speech synthesis is fascinating.
8. Neural networks are powerful.
9. Text to speech converts text to audio.
10. The quick brown fox jumps over the lazy dog.
11. Machine learning enables computers to learn from data.
12. Natural language processing helps machines understand text.
13. Deep learning has revolutionized artificial intelligence.
14. Speech synthesis technology has advanced significantly.
15. Neural voice cloning can replicate speaking styles.
16. Text normalization is important for proper pronunciation.
17. Voice assistants help us interact with technology naturally.
18. Modern TTS systems use deep learning for high-quality speech.
19. Human computer interaction has become more intuitive.

Speaker 5

0. Hello world.
1. How are you today?
2. The sky is blue.
3. I love machine learning.
4. Python is awesome.
5. Good morning everyone.
6. Artificial intelligence is growing.
7. Speech synthesis is fascinating.
8. Neural networks are powerful.
9. Text to speech converts text to audio.
10. The quick brown fox jumps over the lazy dog.
11. Machine learning enables computers to learn from data.
12. Natural language processing helps machines understand text.
13. Deep learning has revolutionized artificial intelligence.
14. Speech synthesis technology has advanced significantly.
15. Neural voice cloning can replicate speaking styles.
16. Text normalization is important for proper pronunciation.
17. Voice assistants help us interact with technology naturally.
18. Modern TTS systems use deep learning for high-quality speech.
19. Human computer interaction has become more intuitive.

Speaker 6

0. Hello world.
1. How are you today?
2. The sky is blue.
3. I love machine learning.
4. Python is awesome.
5. Good morning everyone.
6. Artificial intelligence is growing.
7. Speech synthesis is fascinating.
8. Neural networks are powerful.
9. Text to speech converts text to audio.
10. The quick brown fox jumps over the lazy dog.
11. Machine learning enables computers to learn from data.
12. Natural language processing helps machines understand text.
13. Deep learning has revolutionized artificial intelligence.
14. Speech synthesis technology has advanced significantly.
15. Neural voice cloning can replicate speaking styles.
16. Text normalization is important for proper pronunciation.
17. Voice assistants help us interact with technology naturally.
18. Modern TTS systems use deep learning for high-quality speech.
19. Human computer interaction has become more intuitive.

Speaker 7

0. Hello world.
1. How are you today?
2. The sky is blue.
3. I love machine learning.
4. Python is awesome.
5. Good morning everyone.
6. Artificial intelligence is growing.
7. Speech synthesis is fascinating.
8. Neural networks are powerful.
9. Text to speech converts text to audio.
10. The quick brown fox jumps over the lazy dog.
11. Machine learning enables computers to learn from data.
12. Natural language processing helps machines understand text.
13. Deep learning has revolutionized artificial intelligence.
14. Speech synthesis technology has advanced significantly.
15. Neural voice cloning can replicate speaking styles.
16. Text normalization is important for proper pronunciation.
17. Voice assistants help us interact with technology naturally.
18. Modern TTS systems use deep learning for high-quality speech.
19. Human computer interaction has become more intuitive.

Speaker 8

0. Hello world.
1. How are you today?
2. The sky is blue.
3. I love machine learning.
4. Python is awesome.
5. Good morning everyone.
6. Artificial intelligence is growing.
7. Speech synthesis is fascinating.
8. Neural networks are powerful.
9. Text to speech converts text to audio.
10. The quick brown fox jumps over the lazy dog.
11. Machine learning enables computers to learn from data.
12. Natural language processing helps machines understand text.
13. Deep learning has revolutionized artificial intelligence.
14. Speech synthesis technology has advanced significantly.
15. Neural voice cloning can replicate speaking styles.
16. Text normalization is important for proper pronunciation.
17. Voice assistants help us interact with technology naturally.
18. Modern TTS systems use deep learning for high-quality speech.
19. Human computer interaction has become more intuitive.

Speaker 9

0. Hello world.
1. How are you today?
2. The sky is blue.
3. I love machine learning.
4. Python is awesome.
5. Good morning everyone.
6. Artificial intelligence is growing.
7. Speech synthesis is fascinating.
8. Neural networks are powerful.
9. Text to speech converts text to audio.
10. The quick brown fox jumps over the lazy dog.
11. Machine learning enables computers to learn from data.
12. Natural language processing helps machines understand text.
13. Deep learning has revolutionized artificial intelligence.
14. Speech synthesis technology has advanced significantly.
15. Neural voice cloning can replicate speaking styles.
16. Text normalization is important for proper pronunciation.
17. Voice assistants help us interact with technology naturally.
18. Modern TTS systems use deep learning for high-quality speech.
19. Human computer interaction has become more intuitive.

vits-piper-en_GB-alan-low

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/en/en_GB/alan/low

Number of speakers | Sample rate
1 | 16000

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_GB-alan-low.tar.bz2

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI | URL | China mirror
arm64-v8a | Download | Download
armeabi-v7a | Download | Download
x86_64 | Download | Download
x86 | Download | Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-en_GB-alan-low with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-en_GB-alan-low/en_GB-alan-low.onnx";
  config.model.vits.tokens = "vits-piper-en_GB-alan-low/tokens.txt";
  config.model.vits.data_dir = "vits-piper-en_GB-alan-low/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
  -DSHERPA_ONNX_ENABLE_C_API=ON \
  -DCMAKE_BUILD_TYPE=Release \
  -DBUILD_SHARED_LIBS=ON \
  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc \
  -I /tmp/sherpa-onnx/shared/include \
  -L /tmp/sherpa-onnx/shared/lib \
  -o /tmp/test-piper \
  /tmp/test-piper.c \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.
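
If you launch the compiled example from a script, you can set the library path for the child process only instead of exporting it in your shell. Here is a minimal Python sketch of that approach. The helper `env_with_library_path` is hypothetical, and the paths are the ones used above.

```python
import os
import sys


def env_with_library_path(lib_dir, base_env=None, platform=None):
    """Return a copy of the environment with lib_dir prepended to the loader path."""
    env = dict(os.environ if base_env is None else base_env)
    platform = sys.platform if platform is None else platform
    # macOS uses DYLD_LIBRARY_PATH; Linux uses LD_LIBRARY_PATH.
    var = "DYLD_LIBRARY_PATH" if platform == "darwin" else "LD_LIBRARY_PATH"
    current = env.get(var, "")
    env[var] = lib_dir + (os.pathsep + current if current else "")
    return env


env = env_with_library_path("/tmp/sherpa-onnx/shared/lib")
# To launch the compiled example with this environment:
#   subprocess.run(["/tmp/test-piper"], cwd="/tmp", env=env, check=True)
```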

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_GB-alan-low.tar.bz2

You can use the following code to play with vits-piper-en_GB-alan-low

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-en_GB-alan-low/en_GB-alan-low.onnx",
            data_dir="vits-piper-en_GB-alan-low/espeak-ng-data",
            tokens="vits-piper-en_GB-alan-low/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
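
If soundfile is not available, Python's standard `wave` module can write the result as 16-bit PCM. The sketch below assumes the generated samples are mono floats in the range [-1, 1]. The helper `write_wave_pcm16` is hypothetical, and a synthetic tone stands in for `audio.samples` so that the block runs without a model.

```python
import math
import struct
import wave


def write_wave_pcm16(filename, samples, sample_rate):
    """Write mono float samples in [-1, 1] as a 16-bit PCM WAV file."""
    frames = bytearray()
    for s in samples:
        clipped = max(-1.0, min(1.0, float(s)))
        frames += struct.pack("<h", int(round(clipped * 32767)))
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)  # 2 bytes = 16 bits
        f.setframerate(sample_rate)
        f.writeframes(bytes(frames))


# Synthetic stand-in for audio.samples: 0.5 s of a 440 Hz tone at 16 kHz.
sample_rate = 16000
samples = [0.3 * math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(8000)]
write_wave_pcm16("tone.wav", samples, sample_rate)
```

With a real model you would pass the generated samples and sample rate in place of the synthetic tone.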

C++ API

You can use the following code to play with vits-piper-en_GB-alan-low with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-en_GB-alan-low/en_GB-alan-low.onnx";
  config.model.vits.tokens = "vits-piper-en_GB-alan-low/tokens.txt";
  config.model.vits.data_dir = "vits-piper-en_GB-alan-low/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
  -DSHERPA_ONNX_ENABLE_C_API=ON \
  -DCMAKE_BUILD_TYPE=Release \
  -DBUILD_SHARED_LIBS=ON \
  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++ \
  -std=c++17 \
  -I /tmp/sherpa-onnx/shared/include \
  -L /tmp/sherpa-onnx/shared/lib \
  -o /tmp/test-piper \
  /tmp/test-piper.cc \
  -lsherpa-onnx-cxx-api \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with vits-piper-en_GB-alan-low with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-en_GB-alan-low/en_GB-alan-low.onnx".into()),
                tokens: Some("vits-piper-en_GB-alan-low/tokens.txt".into()),
                data_dir: Some("vits-piper-en_GB-alan-low/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-en_GB-alan-low with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-en_GB-alan-low/en_GB-alan-low.onnx',
        tokens: 'vits-piper-en_GB-alan-low/tokens.txt',
        dataDir: 'vits-piper-en_GB-alan-low/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

const start = Date.now();
const audio = tts.generate({text, generationConfig});
const stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to play with vits-piper-en_GB-alan-low with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-en_GB-alan-low/en_GB-alan-low.onnx',
    tokens: 'vits-piper-en_GB-alan-low/tokens.txt',
    dataDir: 'vits-piper-en_GB-alan-low/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-en_GB-alan-low with Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-en_GB-alan-low/en_GB-alan-low.onnx",
    lexicon: "",
    tokens: "vits-piper-en_GB-alan-low/tokens.txt",
    dataDir: "vits-piper-en_GB-alan-low/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-en_GB-alan-low with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-en_GB-alan-low/en_GB-alan-low.onnx";
config.Model.Vits.Tokens = "vits-piper-en_GB-alan-low/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-en_GB-alan-low/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-en_GB-alan-low with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-en_GB-alan-low/en_GB-alan-low.onnx",
        tokens = "vits-piper-en_GB-alan-low/tokens.txt",
        dataDir = "vits-piper-en_GB-alan-low/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with vits-piper-en_GB-alan-low with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-en_GB-alan-low/en_GB-alan-low.onnx");
    vits.setTokens("vits-piper-en_GB-alan-low/tokens.txt");
    vits.setDataDir("vits-piper-en_GB-alan-low/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-en_GB-alan-low with Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-en_GB-alan-low/en_GB-alan-low.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-en_GB-alan-low/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-en_GB-alan-low/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-en_GB-alan-low with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-en_GB-alan-low/en_GB-alan-low.onnx",
				Tokens:  "vits-piper-en_GB-alan-low/tokens.txt",
				DataDir: "vits-piper-en_GB-alan-low/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.

audio samples for each speaker are listed below:

Speaker 0

vits-piper-en_GB-alan-medium

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/en/en_GB/alan/medium

Number of speakers | Sample rate (Hz)
1 | 22050

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_GB-alan-medium.tar.bz2
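
The download and extraction steps can also be scripted. The following is a minimal Python sketch that uses only the standard library. The helper names `model_url`, `extract_archive`, and `download_and_extract` are made up for this illustration and are not part of sherpa-onnx, and the sketch assumes that other models in the tts-models release follow the same URL pattern as the address above.

```python
import tarfile
import urllib.request
from pathlib import Path
from typing import List

# Assumption: every model archive lives under this release path.
RELEASE_BASE = "https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models"


def model_url(model_name: str) -> str:
    """Build the download URL of a model archive from its name."""
    return f"{RELEASE_BASE}/{model_name}.tar.bz2"


def extract_archive(archive: Path, dest: Path) -> List[str]:
    """Extract a .tar.bz2 archive into dest and return the member names."""
    with tarfile.open(archive, "r:bz2") as tar:
        names = tar.getnames()
        tar.extractall(dest)
    return names


def download_and_extract(model_name: str, dest: Path = Path(".")) -> Path:
    """Download the archive, extract it, delete it, and return the model directory."""
    archive = dest / f"{model_name}.tar.bz2"
    urllib.request.urlretrieve(model_url(model_name), str(archive))
    extract_archive(archive, dest)
    archive.unlink()
    return dest / model_name
```

For example, `download_and_extract("vits-piper-en_GB-alan-medium")` is expected to leave a `vits-piper-en_GB-alan-medium` directory in the current working directory, which is the layout the code examples in this section assume.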

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI | URL | China mirror
arm64-v8a | Download | Download
armeabi-v7a | Download | Download
x86_64 | Download | Download
x86 | Download | Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-en_GB-alan-medium with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-en_GB-alan-medium/en_GB-alan-medium.onnx";
  config.model.vits.tokens = "vits-piper-en_GB-alan-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-en_GB-alan-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

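  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
  if (!tts) {
    // Creation fails if the config is invalid, e.g., a model file is missing
    fprintf(stderr, "Failed to create the TTS engine. Please check your config\n");
    return -1;
  }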

  const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc   -I /tmp/sherpa-onnx/shared/include   -o /tmp/test-piper   /tmp/test-piper.c   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx and soundfile (used below to save the generated audio) via

pip install sherpa-onnx soundfile

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_GB-alan-medium.tar.bz2

You can use the following code to play with vits-piper-en_GB-alan-medium

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-en_GB-alan-medium/en_GB-alan-medium.onnx",
            data_dir="vits-piper-en_GB-alan-medium/espeak-ng-data",
            tokens="vits-piper-en_GB-alan-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
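
If you would rather not depend on soundfile, the generated samples can also be saved with Python's standard `wave` module. The sketch below is only an illustration: `write_wave_pcm16` is a hypothetical helper, not a sherpa-onnx function, and it assumes the samples are mono floats in the range [-1, 1], clipping anything outside that range.

```python
import struct
import wave
from typing import Sequence


def write_wave_pcm16(filename: str, samples: Sequence[float], sample_rate: int) -> None:
    """Write mono float samples (assumed to be in [-1, 1]) as 16-bit PCM WAV."""
    frames = bytearray()
    for s in samples:
        clipped = max(-1.0, min(1.0, float(s)))  # guard against out-of-range values
        frames += struct.pack("<h", int(round(clipped * 32767)))
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)   # mono
        f.setsampwidth(2)   # 2 bytes per sample = 16-bit
        f.setframerate(sample_rate)
        f.writeframes(bytes(frames))
```

With the objects from the example above, the call would look like `write_wave_pcm16("test.wav", audio.samples, audio.sample_rate)`.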

C++ API

You can use the following code to play with vits-piper-en_GB-alan-medium with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-en_GB-alan-medium/en_GB-alan-medium.onnx";
  config.model.vits.tokens = "vits-piper-en_GB-alan-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-en_GB-alan-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++   -std=c++17   -I /tmp/sherpa-onnx/shared/include   -o /tmp/test-piper   /tmp/test-piper.cc   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-cxx-api   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html
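
The progress callback in the C++ example above returns 1 to continue generating and 0 to stop. The following sketch simulates that contract in plain Python so it can be tried without a model. `fake_generate` and `stop_after_half` are hypothetical stand-ins, not sherpa-onnx functions, and the sketch makes no claim about how the real engine splits audio into chunks.

```python
from typing import Callable, List, Optional

# A callback receives the newest chunk and the progress in [0, 1];
# it returns 1 to continue or 0 to stop.
Callback = Callable[[List[float], float], int]


def fake_generate(chunks: List[List[float]],
                  callback: Optional[Callback] = None) -> List[float]:
    """Hypothetical generator that emits audio chunk by chunk."""
    samples: List[float] = []
    for i, chunk in enumerate(chunks, start=1):
        samples.extend(chunk)
        progress = i / len(chunks)
        if callback is not None and callback(chunk, progress) == 0:
            break  # the caller asked to stop early
    return samples


def stop_after_half(chunk: List[float], progress: float) -> int:
    """Example callback: stop once at least half of the work is done."""
    return 0 if progress >= 0.5 else 1
```

A callback like this is a natural place to start playing audio before generation has finished, or to cancel generation when the user navigates away.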

Rust API

You can use the following code to play with vits-piper-en_GB-alan-medium with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-en_GB-alan-medium/en_GB-alan-medium.onnx".into()),
                tokens: Some("vits-piper-en_GB-alan-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-en_GB-alan-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-en_GB-alan-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-en_GB-alan-medium/en_GB-alan-medium.onnx',
        tokens: 'vits-piper-en_GB-alan-medium/tokens.txt',
        dataDir: 'vits-piper-en_GB-alan-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to play with vits-piper-en_GB-alan-medium with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-en_GB-alan-medium/en_GB-alan-medium.onnx',
    tokens: 'vits-piper-en_GB-alan-medium/tokens.txt',
    dataDir: 'vits-piper-en_GB-alan-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-en_GB-alan-medium with Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-en_GB-alan-medium/en_GB-alan-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-en_GB-alan-medium/tokens.txt",
    dataDir: "vits-piper-en_GB-alan-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-en_GB-alan-medium with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-en_GB-alan-medium/en_GB-alan-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-en_GB-alan-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-en_GB-alan-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-en_GB-alan-medium with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-en_GB-alan-medium/en_GB-alan-medium.onnx",
        tokens = "vits-piper-en_GB-alan-medium/tokens.txt",
        dataDir = "vits-piper-en_GB-alan-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with vits-piper-en_GB-alan-medium with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-en_GB-alan-medium/en_GB-alan-medium.onnx");
    vits.setTokens("vits-piper-en_GB-alan-medium/tokens.txt");
    vits.setDataDir("vits-piper-en_GB-alan-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-en_GB-alan-medium with Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-en_GB-alan-medium/en_GB-alan-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-en_GB-alan-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-en_GB-alan-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-en_GB-alan-medium with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-en_GB-alan-medium/en_GB-alan-medium.onnx",
				Tokens:  "vits-piper-en_GB-alan-medium/tokens.txt",
				DataDir: "vits-piper-en_GB-alan-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.

audio samples for each speaker are listed below:

Speaker 0

vits-piper-en_GB-alba-medium

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/en/en_GB/alba/medium

Number of speakers | Sample rate (Hz)
1 | 22050

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_GB-alba-medium.tar.bz2

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI | URL | China mirror
arm64-v8a | Download | Download
armeabi-v7a | Download | Download
x86_64 | Download | Download
x86 | Download | Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-en_GB-alba-medium with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-en_GB-alba-medium/en_GB-alba-medium.onnx";
  config.model.vits.tokens = "vits-piper-en_GB-alba-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-en_GB-alba-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

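  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
  if (!tts) {
    // Creation fails if the config is invalid, e.g., a model file is missing
    fprintf(stderr, "Failed to create the TTS engine. Please check your config\n");
    return -1;
  }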

  const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc   -I /tmp/sherpa-onnx/shared/include   -o /tmp/test-piper   /tmp/test-piper.c   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx and soundfile (used below to save the generated audio) via

pip install sherpa-onnx soundfile

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_GB-alba-medium.tar.bz2

You can use the following code to play with vits-piper-en_GB-alba-medium

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-en_GB-alba-medium/en_GB-alba-medium.onnx",
            data_dir="vits-piper-en_GB-alba-medium/espeak-ng-data",
            tokens="vits-piper-en_GB-alba-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
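
If soundfile is not installed, the generated samples can also be saved with Python's standard wave module. The sketch below assumes that the samples are mono floats in the range [-1, 1]; `write_wav_int16` is a hypothetical helper name, not a sherpa-onnx function.

```python
import struct
import wave


def write_wav_int16(path, samples, sample_rate):
    """Write float samples (assumed to lie in [-1, 1]) as mono 16-bit PCM."""
    frames = bytearray()
    for s in samples:
        # Clip to the valid range, then scale to the int16 range.
        s = max(-1.0, min(1.0, float(s)))
        frames += struct.pack("<h", int(round(s * 32767)))
    with wave.open(path, "wb") as f:
        f.setnchannels(1)  # assumption: the TTS output is mono
        f.setsampwidth(2)  # 2 bytes = 16 bits per sample
        f.setframerate(sample_rate)
        f.writeframes(bytes(frames))


# Usage with the objects from the example above:
#   write_wav_int16("test.wav", audio.samples, audio.sample_rate)
```

Out-of-range samples are clipped rather than wrapped, which avoids loud artifacts if the model overshoots slightly.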

C++ API

You can use the following code to play with vits-piper-en_GB-alba-medium with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-en_GB-alba-medium/en_GB-alba-medium.onnx";
  config.model.vits.tokens = "vits-piper-en_GB-alba-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-en_GB-alba-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++   -std=c++17   -I /tmp/sherpa-onnx/shared/include   -o /tmp/test-piper   /tmp/test-piper.cc   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-cxx-api   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with vits-piper-en_GB-alba-medium with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-en_GB-alba-medium/en_GB-alba-medium.onnx".into()),
                tokens: Some("vits-piper-en_GB-alba-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-en_GB-alba-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.
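
Both the C++ and the Rust examples pass a progress callback whose return value decides whether generation continues. The Python sketch below illustrates that contract on its own, without calling sherpa-onnx: it assumes a callback that receives the samples produced so far in this step plus a progress fraction, and follows the C++ convention of returning 1 to continue and 0 to stop. The factory name `make_limited_callback` is hypothetical.

```python
def make_limited_callback(sample_rate, max_seconds):
    """Build a callback that asks the generator to stop after max_seconds of audio."""
    state = {"num_samples": 0}

    def callback(samples, progress):
        # Accumulate how much audio has been delivered so far.
        state["num_samples"] += len(samples)
        seconds = state["num_samples"] / sample_rate
        print(f"Progress: {progress * 100:.1f}% ({seconds:.2f} s generated)")
        # 1 means continue, 0 means stop (same convention as the C++ example).
        return 1 if seconds < max_seconds else 0

    return callback


if __name__ == "__main__":
    cb = make_limited_callback(sample_rate=22050, max_seconds=1.0)
    # Simulate three chunks of 11025 samples (0.5 s each at 22050 Hz).
    for i in range(3):
        if cb([0.0] * 11025, (i + 1) / 3) == 0:
            break
```

Keeping the running total in a closure lets the same pattern be used to stop early, for example when a user cancels playback.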

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-en_GB-alba-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-en_GB-alba-medium/en_GB-alba-medium.onnx',
        tokens: 'vits-piper-en_GB-alba-medium/tokens.txt',
        dataDir: 'vits-piper-en_GB-alba-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.
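
The real-time factor (RTF) printed by the example is the time spent generating divided by the duration of the generated audio, so values below 1 mean faster than real time. A minimal Python sketch of the same arithmetic follows; the function name and the sample numbers are illustrative only.

```python
def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """Return processing time divided by the duration of the generated audio."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # seconds of audio
    return elapsed_seconds / duration


if __name__ == "__main__":
    # Illustrative numbers: 0.5 s spent generating 44100 samples at 22050 Hz,
    # i.e. 2 s of audio.
    rtf = real_time_factor(0.5, 44100, 22050)
    print(f"RTF = {rtf:.3f}")  # prints "RTF = 0.250"
```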

Dart API

You can use the following code to play with vits-piper-en_GB-alba-medium with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-en_GB-alba-medium/en_GB-alba-medium.onnx',
    tokens: 'vits-piper-en_GB-alba-medium/tokens.txt',
    dataDir: 'vits-piper-en_GB-alba-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-en_GB-alba-medium with Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-en_GB-alba-medium/en_GB-alba-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-en_GB-alba-medium/tokens.txt",
    dataDir: "vits-piper-en_GB-alba-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-en_GB-alba-medium with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-en_GB-alba-medium/en_GB-alba-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-en_GB-alba-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-en_GB-alba-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-en_GB-alba-medium with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-en_GB-alba-medium/en_GB-alba-medium.onnx",
        tokens = "vits-piper-en_GB-alba-medium/tokens.txt",
        dataDir = "vits-piper-en_GB-alba-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with vits-piper-en_GB-alba-medium with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-en_GB-alba-medium/en_GB-alba-medium.onnx");
    vits.setTokens("vits-piper-en_GB-alba-medium/tokens.txt");
    vits.setDataDir("vits-piper-en_GB-alba-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-en_GB-alba-medium with Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-en_GB-alba-medium/en_GB-alba-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-en_GB-alba-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-en_GB-alba-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-en_GB-alba-medium with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-en_GB-alba-medium/en_GB-alba-medium.onnx",
				Tokens:  "vits-piper-en_GB-alba-medium/tokens.txt",
				DataDir: "vits-piper-en_GB-alba-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.

sample audios for different speakers are listed below:

Speaker 0

vits-piper-en_GB-aru-medium

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/en/en_GB/aru/medium

| Number of speakers | Sample rate |
|---|---|
| 12 | 22050 |

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_GB-aru-medium.tar.bz2
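
The archive can be downloaded and unpacked with wget and tar xvf. As an alternative, the two steps can be sketched in Python using only the standard library. The function names `download` and `extract` are hypothetical helpers, not part of sherpa-onnx, and the extraction filter is only used when the running Python version supports it.

```python
import os
import tarfile
import urllib.request

MODEL_URL = (
    "https://github.com/k2-fsa/sherpa-onnx/releases/download/"
    "tts-models/vits-piper-en_GB-aru-medium.tar.bz2"
)


def download(url, dest_path):
    """Download url to dest_path unless the file already exists."""
    if not os.path.exists(dest_path):
        urllib.request.urlretrieve(url, dest_path)
    return dest_path


def extract(archive_path, dest_dir="."):
    """Extract a .tar.bz2 archive into dest_dir and return the sorted member names."""
    with tarfile.open(archive_path, "r:bz2") as tar:
        names = sorted(tar.getnames())
        if hasattr(tarfile, "data_filter"):
            # Newer Python versions: refuse unsafe members such as absolute paths.
            tar.extractall(dest_dir, filter="data")
        else:
            tar.extractall(dest_dir)
    return names


# Usage (needs network access):
#   archive = download(MODEL_URL, "vits-piper-en_GB-aru-medium.tar.bz2")
#   print(extract(archive)[:10])
```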

Android APK

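The following table lists the Android TTS Engine APKs built with this model for sherpa-onnx v1.13.2.

| ABI | URL | China mirror |
|---|---|---|
| arm64-v8a | Download | Download |
| armeabi-v7a | Download | Download |
| x86_64 | Download | Download |
| x86 | Download | Download |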

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-en_GB-aru-medium with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-en_GB-aru-medium/en_GB-aru-medium.onnx";
  config.model.vits.tokens = "vits-piper-en_GB-aru-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-en_GB-aru-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

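  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
  if (!tts) {
    // Creation fails if, for example, a model file path is wrong
    fprintf(stderr, "Failed to create the TTS object. Please check your config\n");
    return -1;
  }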

  const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // Free the generated audio and the TTS object to avoid memory leaks in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc   -I /tmp/sherpa-onnx/shared/include   -o /tmp/test-piper   /tmp/test-piper.c   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.
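
The C example uses relative model paths, so it only works when it is run from the directory that contains vits-piper-en_GB-aru-medium (here /tmp). A minimal Python sketch for checking that the three paths used by the example exist is shown below; `missing_model_files` is a hypothetical helper, not a sherpa-onnx API.

```python
import os


def missing_model_files(model_dir, onnx_name):
    """Return the expected files/directories that are absent from model_dir."""
    # The three entries referenced by the example config.
    expected = [onnx_name, "tokens.txt", "espeak-ng-data"]
    return [
        name for name in expected
        if not os.path.exists(os.path.join(model_dir, name))
    ]


if __name__ == "__main__":
    missing = missing_model_files(
        "/tmp/vits-piper-en_GB-aru-medium", "en_GB-aru-medium.onnx"
    )
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All model files found")
```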

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_GB-aru-medium.tar.bz2

You can use the following code to play with vits-piper-en_GB-aru-medium

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-en_GB-aru-medium/en_GB-aru-medium.onnx",
            data_dir="vits-piper-en_GB-aru-medium/espeak-ng-data",
            tokens="vits-piper-en_GB-aru-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
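
This model has more than one speaker (12, according to the model information above), so `sid` can be varied to hear the different voices. The sketch below loops over speaker IDs using only the `generate(text=..., sid=..., speed=...)` call shown in the example; `generate_for_speakers` is a hypothetical helper name, and the valid ID range of 0 to 11 is an assumption based on the speaker count.

```python
def generate_for_speakers(tts, text, speaker_ids, speed=1.0):
    """Call tts.generate() once per speaker ID and collect the results.

    `tts` is any object with a generate(text=..., sid=..., speed=...) method,
    such as the sherpa_onnx.OfflineTts instance created in the example.
    """
    results = {}
    for sid in speaker_ids:
        if sid < 0:
            raise ValueError(f"invalid speaker ID: {sid}")
        results[sid] = tts.generate(text=text, sid=sid, speed=speed)
    return results


# Usage with the objects from the example above:
#   audios = generate_for_speakers(tts, "Hello world.", range(12))
#   for sid, audio in audios.items():
#       sf.write(f"speaker-{sid}.wav", audio.samples, samplerate=audio.sample_rate)
```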

C++ API

You can use the following code to play with vits-piper-en_GB-aru-medium with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-en_GB-aru-medium/en_GB-aru-medium.onnx";
  config.model.vits.tokens = "vits-piper-en_GB-aru-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-en_GB-aru-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++   -std=c++17   -I /tmp/sherpa-onnx/shared/include   -o /tmp/test-piper   /tmp/test-piper.cc   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-cxx-api   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with vits-piper-en_GB-aru-medium with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-en_GB-aru-medium/en_GB-aru-medium.onnx".into()),
                tokens: Some("vits-piper-en_GB-aru-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-en_GB-aru-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-en_GB-aru-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-en_GB-aru-medium/en_GB-aru-medium.onnx',
        tokens: 'vits-piper-en_GB-aru-medium/tokens.txt',
        dataDir: 'vits-piper-en_GB-aru-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to play with vits-piper-en_GB-aru-medium with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-en_GB-aru-medium/en_GB-aru-medium.onnx',
    tokens: 'vits-piper-en_GB-aru-medium/tokens.txt',
    dataDir: 'vits-piper-en_GB-aru-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-en_GB-aru-medium with Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-en_GB-aru-medium/en_GB-aru-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-en_GB-aru-medium/tokens.txt",
    dataDir: "vits-piper-en_GB-aru-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-en_GB-aru-medium with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-en_GB-aru-medium/en_GB-aru-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-en_GB-aru-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-en_GB-aru-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-en_GB-aru-medium with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-en_GB-aru-medium/en_GB-aru-medium.onnx",
        tokens = "vits-piper-en_GB-aru-medium/tokens.txt",
        dataDir = "vits-piper-en_GB-aru-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with vits-piper-en_GB-aru-medium with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-en_GB-aru-medium/en_GB-aru-medium.onnx");
    vits.setTokens("vits-piper-en_GB-aru-medium/tokens.txt");
    vits.setDataDir("vits-piper-en_GB-aru-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-en_GB-aru-medium with Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-en_GB-aru-medium/en_GB-aru-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-en_GB-aru-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-en_GB-aru-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-en_GB-aru-medium with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-en_GB-aru-medium/en_GB-aru-medium.onnx",
				Tokens:  "vits-piper-en_GB-aru-medium/tokens.txt",
				DataDir: "vits-piper-en_GB-aru-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.

sample audio for each speaker is listed below:

Speaker 0

Speaker 1

Speaker 2

Speaker 3

Speaker 4

Speaker 5

Speaker 6

Speaker 7

Speaker 8

Speaker 9

Speaker 10

Speaker 11

vits-piper-en_GB-cori-high

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/en/en_GB/cori/high

Number of speakers    Sample rate (Hz)
1                     22050

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_GB-cori-high.tar.bz2
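
If you prefer to script the download instead of fetching the archive by hand, the following is a minimal sketch using only the Python standard library. The helper names are hypothetical, and it assumes the archive unpacks into a directory named after the archive (vits-piper-en_GB-cori-high), which is the layout the examples in this section expect.

```python
import tarfile
import urllib.request
from pathlib import Path

MODEL_URL = (
    "https://github.com/k2-fsa/sherpa-onnx/releases/download/"
    "tts-models/vits-piper-en_GB-cori-high.tar.bz2"
)


def archive_name(url: str) -> str:
    """Return the file name at the end of the URL."""
    return url.rsplit("/", 1)[-1]


def extract_archive(archive: Path, dest: Path) -> None:
    """Extract a .tar.bz2 archive into dest."""
    with tarfile.open(archive, "r:bz2") as tar:
        tar.extractall(dest)


def download_and_extract(url: str = MODEL_URL, dest: Path = Path(".")) -> Path:
    """Download the archive, extract it, and return the assumed model directory."""
    archive = dest / archive_name(url)
    urllib.request.urlretrieve(url, archive)
    extract_archive(archive, dest)
    archive.unlink()  # remove the archive once it has been extracted
    # Assumption: the top-level directory matches the archive name.
    return dest / archive.name.removesuffix(".tar.bz2")
```

Calling download_and_extract() from the directory where you run the examples should leave a vits-piper-en_GB-cori-high directory next to your program. Note that str.removesuffix requires Python 3.9 or later.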

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI            URL         China mirror
arm64-v8a      Download    Download
armeabi-v7a    Download    Download
x86_64         Download    Download
x86            Download    Download

If you don’t know what ABI is, you probably need to select arm64-v8a.
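
If you have shell access to the device, `adb shell getprop ro.product.cpu.abi` reports the ABI directly. When you only know the machine architecture string (for example from `uname -m`), the sketch below maps common values to the ABI names in the table. The function name and the fallback to arm64-v8a are illustrative choices that follow the advice above; they are not part of sherpa-onnx.

```python
# Common `uname -m` values and the Android ABI they correspond to.
_MACHINE_TO_ABI = {
    "aarch64": "arm64-v8a",
    "arm64": "arm64-v8a",
    "armv7l": "armeabi-v7a",
    "x86_64": "x86_64",
    "i686": "x86",
    "i386": "x86",
}


def android_abi(machine: str) -> str:
    """Return the Android ABI for a machine string, defaulting to arm64-v8a."""
    return _MACHINE_TO_ABI.get(machine.strip().lower(), "arm64-v8a")
```

For example, android_abi("aarch64") selects the arm64-v8a APK, and an unrecognized string falls back to the same choice.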

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-en_GB-cori-high with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-en_GB-cori-high/en_GB-cori-high.onnx";
  config.model.vits.tokens = "vits-piper-en_GB-cori-high/tokens.txt";
  config.model.vits.data_dir = "vits-piper-en_GB-cori-high/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.c -lsherpa-onnx-c-api -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_GB-cori-high.tar.bz2

You can use the following code to play with vits-piper-en_GB-cori-high with Python API.

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-en_GB-cori-high/en_GB-cori-high.onnx",
            data_dir="vits-piper-en_GB-cori-high/espeak-ng-data",
            tokens="vits-piper-en_GB-cori-high/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
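
If you would rather not depend on soundfile for the final write, the sketch below saves the samples with Python's built-in wave module. It assumes the samples are mono floats in the range [-1, 1], which is what the generated audio is expected to contain; write_wave_pcm16 is a hypothetical helper, not a sherpa-onnx function.

```python
import struct
import wave


def write_wave_pcm16(filename, samples, sample_rate):
    """Write mono float samples in [-1, 1] as a 16-bit PCM WAV file."""
    frames = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))  # clip so the value fits in int16
        frames += struct.pack("<h", int(round(s * 32767)))
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)  # 2 bytes = 16 bits per sample
        f.setframerate(sample_rate)
        f.writeframes(bytes(frames))
```

With the audio object from the example above, the call would be write_wave_pcm16("test.wav", audio.samples, audio.sample_rate).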

C++ API

You can use the following code to play with vits-piper-en_GB-cori-high with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-en_GB-cori-high/en_GB-cori-high.onnx";
  config.model.vits.tokens = "vits-piper-en_GB-cori-high/tokens.txt";
  config.model.vits.data_dir = "vits-piper-en_GB-cori-high/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.cc -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with vits-piper-en_GB-cori-high with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-en_GB-cori-high/en_GB-cori-high.onnx".into()),
                tokens: Some("vits-piper-en_GB-cori-high/tokens.txt".into()),
                data_dir: Some("vits-piper-en_GB-cori-high/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-en_GB-cori-high with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-en_GB-cori-high/en_GB-cori-high.onnx',
        tokens: 'vits-piper-en_GB-cori-high/tokens.txt',
        dataDir: 'vits-piper-en_GB-cori-high/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.
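
The real-time factor (RTF) printed by the example above is the synthesis time divided by the duration of the generated audio, so values below 1 mean the model runs faster than real time. As a small worked sketch of the same arithmetic in Python (the function name is illustrative only):

```python
def real_time_factor(elapsed_seconds: float, num_samples: int, sample_rate: int) -> float:
    """Return elapsed time divided by the duration of the generated audio."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # seconds of audio produced
    return elapsed_seconds / duration
```

For instance, producing 110250 samples at 22050 Hz (5 seconds of audio) in 0.5 seconds gives an RTF of 0.1.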

Dart API

You can use the following code to play with vits-piper-en_GB-cori-high with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-en_GB-cori-high/en_GB-cori-high.onnx',
    tokens: 'vits-piper-en_GB-cori-high/tokens.txt',
    dataDir: 'vits-piper-en_GB-cori-high/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-en_GB-cori-high with Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-en_GB-cori-high/en_GB-cori-high.onnx",
    lexicon: "",
    tokens: "vits-piper-en_GB-cori-high/tokens.txt",
    dataDir: "vits-piper-en_GB-cori-high/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-en_GB-cori-high with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-en_GB-cori-high/en_GB-cori-high.onnx";
config.Model.Vits.Tokens = "vits-piper-en_GB-cori-high/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-en_GB-cori-high/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-en_GB-cori-high with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-en_GB-cori-high/en_GB-cori-high.onnx",
        tokens = "vits-piper-en_GB-cori-high/tokens.txt",
        dataDir = "vits-piper-en_GB-cori-high/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.
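
The callback in the example above always returns 1, so generation never stops early. To illustrate the continue/stop contract, here is a minimal Python sketch of a callback that returns 0 once a sample budget has been reached. SampleBudget is a hypothetical class; how a callback is registered, and whether it also receives a progress value, depends on the language binding, so treat the signature as an assumption.

```python
class SampleBudget:
    """Callback that asks generation to stop once enough samples have arrived."""

    def __init__(self, max_samples: int):
        self.max_samples = max_samples
        self.received = 0

    def __call__(self, samples, progress=None) -> int:
        # Count the samples delivered so far.
        self.received += len(samples)
        # 1 means to continue, 0 means to stop.
        return 1 if self.received < self.max_samples else 0
```

A budget of 5 * 22050 samples, for example, would request a stop after roughly five seconds of audio at this model's sample rate.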

Java API

You can use the following code to play with vits-piper-en_GB-cori-high with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-en_GB-cori-high/en_GB-cori-high.onnx");
    vits.setTokens("vits-piper-en_GB-cori-high/tokens.txt");
    vits.setDataDir("vits-piper-en_GB-cori-high/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-en_GB-cori-high with Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-en_GB-cori-high/en_GB-cori-high.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-en_GB-cori-high/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-en_GB-cori-high/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-en_GB-cori-high with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-en_GB-cori-high/en_GB-cori-high.onnx",
				Tokens:  "vits-piper-en_GB-cori-high/tokens.txt",
				DataDir: "vits-piper-en_GB-cori-high/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.

sample audio for each speaker is listed below:

Speaker 0

vits-piper-en_GB-cori-medium

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/en/en_GB/cori/medium

Number of speakers    Sample rate (Hz)
1                     22050
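
The sample rate determines how many samples a given length of speech contains, and therefore how large the saved file is. The sketch below works through that arithmetic in Python, assuming mono 16-bit PCM output with the canonical 44-byte WAV header; the function names are illustrative.

```python
SAMPLE_RATE = 22050  # Hz, from the table above


def num_samples(seconds: float, sample_rate: int = SAMPLE_RATE) -> int:
    """Number of samples needed for the given duration."""
    return round(seconds * sample_rate)


def pcm16_wav_bytes(n_samples: int, channels: int = 1) -> int:
    """Size in bytes of a 16-bit PCM WAV file with a 44-byte header."""
    return 44 + n_samples * channels * 2
```

Ten seconds of speech at 22050 Hz is 220500 samples, which comes to 441044 bytes under these assumptions.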

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_GB-cori-medium.tar.bz2
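
After extracting the archive, it can help to confirm that the files the examples below refer to are actually present before creating the TTS object. The following Python sketch derives the expected paths from the directory name; the helper names are hypothetical, and the layout (an .onnx file named after the voice, tokens.txt, and an espeak-ng-data directory) is taken from the paths used in the examples in this section.

```python
from pathlib import Path


def piper_model_files(model_dir):
    """Return the paths the examples expect inside a vits-piper-* directory."""
    root = Path(model_dir)
    voice = root.name.removeprefix("vits-piper-")  # e.g. en_GB-cori-medium
    return {
        "model": root / f"{voice}.onnx",
        "tokens": root / "tokens.txt",
        "data_dir": root / "espeak-ng-data",
    }


def missing_files(model_dir):
    """List the names of expected entries that do not exist on disk."""
    return [name for name, path in piper_model_files(model_dir).items()
            if not path.exists()]
```

An empty list from missing_files("vits-piper-en_GB-cori-medium") suggests the directory is laid out as the examples expect. Note that str.removeprefix requires Python 3.9 or later.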

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI            URL         China mirror
arm64-v8a      Download    Download
armeabi-v7a    Download    Download
x86_64         Download    Download
x86            Download    Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-en_GB-cori-medium with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-en_GB-cori-medium/en_GB-cori-medium.onnx";
  config.model.vits.tokens = "vits-piper-en_GB-cori-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-en_GB-cori-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.c -lsherpa-onnx-c-api -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.
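
If you launch the compiled program from a script, you can set the library search path for that one process instead of exporting it in your shell. The Python sketch below builds such an environment; the helper names are hypothetical and only Linux and macOS are handled, matching the two cases above.

```python
import os
import platform


def library_path_variable(system=None):
    """Return the dynamic-loader search path variable for the platform."""
    system = system or platform.system()
    if system == "Linux":
        return "LD_LIBRARY_PATH"
    if system == "Darwin":  # macOS
        return "DYLD_LIBRARY_PATH"
    raise ValueError(f"unsupported system: {system}")


def env_with_library_dir(lib_dir, system=None, base_env=None):
    """Return a copy of the environment with lib_dir prepended to the search path."""
    env = dict(os.environ if base_env is None else base_env)
    var = library_path_variable(system)
    old = env.get(var, "")
    env[var] = lib_dir if not old else f"{lib_dir}:{old}"
    return env
```

You could then pass env_with_library_dir("/tmp/sherpa-onnx/shared/lib") as the env argument of subprocess.run(["/tmp/test-piper"], cwd="/tmp", check=True).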

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_GB-cori-medium.tar.bz2

You can use the following code to play with vits-piper-en_GB-cori-medium with Python API.

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-en_GB-cori-medium/en_GB-cori-medium.onnx",
            data_dir="vits-piper-en_GB-cori-medium/espeak-ng-data",
            tokens="vits-piper-en_GB-cori-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)

C++ API

You can use the following code to play with vits-piper-en_GB-cori-medium with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-en_GB-cori-medium/en_GB-cori-medium.onnx";
  config.model.vits.tokens = "vits-piper-en_GB-cori-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-en_GB-cori-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.cc -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with vits-piper-en_GB-cori-medium with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-en_GB-cori-medium/en_GB-cori-medium.onnx".into()),
                tokens: Some("vits-piper-en_GB-cori-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-en_GB-cori-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to try vits-piper-en_GB-cori-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-en_GB-cori-medium/en_GB-cori-medium.onnx',
        tokens: 'vits-piper-en_GB-cori-medium/tokens.txt',
        dataDir: 'vits-piper-en_GB-cori-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.
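
The real-time factor (RTF) printed by the example above is the elapsed synthesis time divided by the duration of the generated audio, so a value below 1 means synthesis ran faster than playback. A minimal sketch of the same arithmetic in Python follows; the function name and the numbers are illustrative and not part of sherpa-onnx.

```python
def real_time_factor(elapsed_seconds: float, num_samples: int, sample_rate: int) -> float:
    """Return elapsed time divided by audio duration (same formula as the example above)."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # seconds of generated audio
    return elapsed_seconds / duration


# Illustrative numbers only: 5 seconds of audio at 22050 Hz produced in 0.5 seconds.
rtf = real_time_factor(elapsed_seconds=0.5, num_samples=5 * 22050, sample_rate=22050)
print(f"RTF = {rtf:.3f}")
```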

Dart API

You can use the following code to try vits-piper-en_GB-cori-medium with the Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-en_GB-cori-medium/en_GB-cori-medium.onnx',
    tokens: 'vits-piper-en_GB-cori-medium/tokens.txt',
    dataDir: 'vits-piper-en_GB-cori-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to try vits-piper-en_GB-cori-medium with the Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-en_GB-cori-medium/en_GB-cori-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-en_GB-cori-medium/tokens.txt",
    dataDir: "vits-piper-en_GB-cori-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to try vits-piper-en_GB-cori-medium with the C# API.

using System;
using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-en_GB-cori-medium/en_GB-cori-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-en_GB-cori-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-en_GB-cori-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to try vits-piper-en_GB-cori-medium with the Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-en_GB-cori-medium/en_GB-cori-medium.onnx",
        tokens = "vits-piper-en_GB-cori-medium/tokens.txt",
        dataDir = "vits-piper-en_GB-cori-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to try vits-piper-en_GB-cori-medium with the Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-en_GB-cori-medium/en_GB-cori-medium.onnx");
    vits.setTokens("vits-piper-en_GB-cori-medium/tokens.txt");
    vits.setDataDir("vits-piper-en_GB-cori-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to try vits-piper-en_GB-cori-medium with the Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-en_GB-cori-medium/en_GB-cori-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-en_GB-cori-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-en_GB-cori-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to try vits-piper-en_GB-cori-medium with the Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-en_GB-cori-medium/en_GB-cori-medium.onnx",
				Tokens:  "vits-piper-en_GB-cori-medium/tokens.txt",
				DataDir: "vits-piper-en_GB-cori-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.

sample audios for different speakers are listed below:

Speaker 0

vits-piper-en_GB-dii-high

Info about this model

This model is converted from https://huggingface.co/OpenVoiceOS/pipertts_en-GB_dii

Number of speakers | Sample rate (Hz)
1 | 22050

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_GB-dii-high.tar.bz2
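
As a minimal sketch, the archive can be fetched and unpacked with the Python standard library alone. The helper names `download` and `extract_archive` are illustrative and not part of sherpa-onnx; the URL is the one given above.

```python
import tarfile
import urllib.request
from pathlib import Path

MODEL_URL = (
    "https://github.com/k2-fsa/sherpa-onnx/releases/download/"
    "tts-models/vits-piper-en_GB-dii-high.tar.bz2"
)


def download(url: str, dest: Path) -> Path:
    """Download url to dest unless dest already exists."""
    if not dest.exists():
        urllib.request.urlretrieve(url, str(dest))
    return dest


def extract_archive(archive: Path, out_dir: Path) -> list:
    """Extract a .tar.bz2 archive into out_dir and return the member names."""
    out_dir.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive, "r:bz2") as tar:
        names = tar.getnames()
        # Use the safer "data" extraction filter where this Python supports it.
        if hasattr(tarfile, "data_filter"):
            tar.extractall(out_dir, filter="data")
        else:
            tar.extractall(out_dir)
    return names


# Usage (left commented out because it downloads the archive):
# archive = download(MODEL_URL, Path("vits-piper-en_GB-dii-high.tar.bz2"))
# extract_archive(archive, Path("."))
```

After extraction, the examples below expect the directory vits-piper-en_GB-dii-high in the current working directory.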

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI | URL | China mirror
arm64-v8a | Download | Download
armeabi-v7a | Download | Download
x86_64 | Download | Download
x86 | Download | Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to try vits-piper-en_GB-dii-high with the C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-en_GB-dii-high/en_GB-dii-high.onnx";
  config.model.vits.tokens = "vits-piper-en_GB-dii-high/tokens.txt";
  config.model.vits.data_dir = "vits-piper-en_GB-dii-high/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.c -lsherpa-onnx-c-api -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html
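
The two export lines above can also be applied from a small launcher script instead of the shell. The following Python sketch assumes the install prefix used above; the helper names `env_with_library_path` and `run_with_library_path` are illustrative.

```python
import os
import platform
import subprocess


def env_with_library_path(lib_dir, system=None, base_env=None):
    """Return a copy of the environment with lib_dir prepended to the loader
    search path: LD_LIBRARY_PATH on Linux, DYLD_LIBRARY_PATH on macOS."""
    system = system or platform.system()
    env = dict(os.environ if base_env is None else base_env)
    if system == "Linux":
        var = "LD_LIBRARY_PATH"
    elif system == "Darwin":
        var = "DYLD_LIBRARY_PATH"
    else:
        raise ValueError(f"unsupported system: {system}")
    old = env.get(var, "")
    env[var] = lib_dir + (os.pathsep + old if old else "")
    return env


def run_with_library_path(binary, lib_dir, cwd=None):
    """Run binary with the library path set; raises CalledProcessError on failure."""
    return subprocess.run([binary], cwd=cwd, env=env_with_library_path(lib_dir), check=True)


# Usage (left commented out because it needs the compiled binary):
# run_with_library_path("./test-piper", "/tmp/sherpa-onnx/shared/lib", cwd="/tmp")
```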

Python API

Assume you have installed sherpa-onnx and soundfile (used below to write the audio file) via

pip install sherpa-onnx soundfile

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_GB-dii-high.tar.bz2

You can then use the following code to try vits-piper-en_GB-dii-high with the Python API:

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-en_GB-dii-high/en_GB-dii-high.onnx",
            data_dir="vits-piper-en_GB-dii-high/espeak-ng-data",
            tokens="vits-piper-en_GB-dii-high/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
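
Before building the config, it can help to confirm that the three paths it references exist on disk. The sketch below assumes the archive was extracted into the current directory; `missing_model_files` is an illustrative helper, not a sherpa-onnx function.

```python
from pathlib import Path


def missing_model_files(model_dir, onnx_name):
    """Return the paths used by the config above that do not exist on disk."""
    root = Path(model_dir)
    required = [
        root / onnx_name,         # VITS model file
        root / "tokens.txt",      # token table
        root / "espeak-ng-data",  # espeak-ng data directory
    ]
    return [str(p) for p in required if not p.exists()]


missing = missing_model_files("vits-piper-en_GB-dii-high", "en_GB-dii-high.onnx")
if missing:
    print("Missing:", ", ".join(missing))
else:
    print("All model files found")
```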

C++ API

You can use the following code to try vits-piper-en_GB-dii-high with the C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-en_GB-dii-high/en_GB-dii-high.onnx";
  config.model.vits.tokens = "vits-piper-en_GB-dii-high/tokens.txt";
  config.model.vits.data_dir = "vits-piper-en_GB-dii-high/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.cc -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html
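
The callback contract in the C++ example (return 1 to continue, 0 to stop) can be illustrated with a toy model in Python. This is not sherpa-onnx code: `fake_generate` is a hypothetical stand-in that only hands out pre-made chunks, to show how a callback's return value can cut generation short.

```python
def fake_generate(chunks, callback):
    """Toy stand-in for a TTS engine: pass each chunk and the progress so far
    to callback, and stop early when the callback returns 0."""
    produced = []
    total = len(chunks)
    for i, chunk in enumerate(chunks, start=1):
        produced.extend(chunk)
        if callback(chunk, i / total) == 0:
            break
    return produced


def stop_after_half(samples, progress):
    """Report progress and ask the generator to stop once half is done."""
    print(f"Progress: {progress * 100:.3f}%")
    return 1 if progress < 0.5 else 0


audio = fake_generate([[0.0, 0.1], [0.2, 0.3], [0.4, 0.5], [0.6, 0.7]], stop_after_half)
print(len(audio), "samples kept")
```

Whether the real library keeps the chunk that was delivered when the callback asked to stop is not covered here; the toy model simply keeps it.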

Rust API

You can use the following code to try vits-piper-en_GB-dii-high with the Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-en_GB-dii-high/en_GB-dii-high.onnx".into()),
                tokens: Some("vits-piper-en_GB-dii-high/tokens.txt".into()),
                data_dir: Some("vits-piper-en_GB-dii-high/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to try vits-piper-en_GB-dii-high with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-en_GB-dii-high/en_GB-dii-high.onnx',
        tokens: 'vits-piper-en_GB-dii-high/tokens.txt',
        dataDir: 'vits-piper-en_GB-dii-high/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to try vits-piper-en_GB-dii-high with the Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-en_GB-dii-high/en_GB-dii-high.onnx',
    tokens: 'vits-piper-en_GB-dii-high/tokens.txt',
    dataDir: 'vits-piper-en_GB-dii-high/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to try vits-piper-en_GB-dii-high with the Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-en_GB-dii-high/en_GB-dii-high.onnx",
    lexicon: "",
    tokens: "vits-piper-en_GB-dii-high/tokens.txt",
    dataDir: "vits-piper-en_GB-dii-high/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to try vits-piper-en_GB-dii-high with the C# API.

using System;
using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-en_GB-dii-high/en_GB-dii-high.onnx";
config.Model.Vits.Tokens = "vits-piper-en_GB-dii-high/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-en_GB-dii-high/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to try vits-piper-en_GB-dii-high with the Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-en_GB-dii-high/en_GB-dii-high.onnx",
        tokens = "vits-piper-en_GB-dii-high/tokens.txt",
        dataDir = "vits-piper-en_GB-dii-high/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to try vits-piper-en_GB-dii-high with the Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-en_GB-dii-high/en_GB-dii-high.onnx");
    vits.setTokens("vits-piper-en_GB-dii-high/tokens.txt");
    vits.setDataDir("vits-piper-en_GB-dii-high/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to try vits-piper-en_GB-dii-high with the Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-en_GB-dii-high/en_GB-dii-high.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-en_GB-dii-high/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-en_GB-dii-high/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to try vits-piper-en_GB-dii-high with the Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-en_GB-dii-high/en_GB-dii-high.onnx",
				Tokens:  "vits-piper-en_GB-dii-high/tokens.txt",
				DataDir: "vits-piper-en_GB-dii-high/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.
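
Each example above writes test.wav. As a quick check, the following Python sketch reads the file with the standard-library `wave` module and reports its sample rate (22050 Hz is expected for this model), channel count, and duration. It assumes the file is uncompressed PCM, which is the only kind `wave` reads; `wav_info` is an illustrative helper.

```python
import wave


def wav_info(path):
    """Return (sample_rate, num_channels, duration_seconds) of a PCM WAV file."""
    with wave.open(path, "rb") as f:
        rate = f.getframerate()
        return rate, f.getnchannels(), f.getnframes() / rate


# Usage, after running one of the examples above:
# rate, channels, seconds = wav_info("test.wav")
# print(rate, channels, round(seconds, 3))
```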

Samples

For the following text:

Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.

sample audios for different speakers are listed below:

Speaker 0

vits-piper-en_GB-jenny_dioco-medium

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/en/en_GB/jenny_dioco/medium

Number of speakers | Sample rate (Hz)
1 | 22050

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_GB-jenny_dioco-medium.tar.bz2

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI | URL | China mirror
arm64-v8a | Download | Download
armeabi-v7a | Download | Download
x86_64 | Download | Download
x86 | Download | Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to try vits-piper-en_GB-jenny_dioco-medium with the C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-en_GB-jenny_dioco-medium/en_GB-jenny_dioco-medium.onnx";
  config.model.vits.tokens = "vits-piper-en_GB-jenny_dioco-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-en_GB-jenny_dioco-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc   -I /tmp/sherpa-onnx/shared/include   -o /tmp/test-piper   /tmp/test-piper.c   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_GB-jenny_dioco-medium.tar.bz2

You can use the following code to play with vits-piper-en_GB-jenny_dioco-medium

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-en_GB-jenny_dioco-medium/en_GB-jenny_dioco-medium.onnx",
            data_dir="vits-piper-en_GB-jenny_dioco-medium/espeak-ng-data",
            tokens="vits-piper-en_GB-jenny_dioco-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
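
If soundfile is not available, the audio can also be saved with Python's built-in wave module. The sketch below assumes audio.samples is a sequence of floats in the range [-1, 1] and converts them to 16-bit PCM; write_wav_16bit is a name made up for this example.

```python
import struct
import wave


def write_wav_16bit(filename, samples, sample_rate):
    """Write mono float samples (assumed in [-1, 1]) as a 16-bit PCM WAV file."""
    pcm = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))  # clip to avoid integer overflow
        pcm += struct.pack("<h", int(s * 32767))
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)  # mono
        f.setsampwidth(2)  # 2 bytes = 16 bits per sample
        f.setframerate(sample_rate)
        f.writeframes(bytes(pcm))
```

With the objects from the example above, the call would be write_wav_16bit("test.wav", audio.samples, audio.sample_rate).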

C++ API

You can use the following code to play with vits-piper-en_GB-jenny_dioco-medium with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-en_GB-jenny_dioco-medium/en_GB-jenny_dioco-medium.onnx";
  config.model.vits.tokens = "vits-piper-en_GB-jenny_dioco-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-en_GB-jenny_dioco-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++   -std=c++17   -I /tmp/sherpa-onnx/shared/include   -o /tmp/test-piper   /tmp/test-piper.cc   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-cxx-api   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with vits-piper-en_GB-jenny_dioco-medium with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-en_GB-jenny_dioco-medium/en_GB-jenny_dioco-medium.onnx".into()),
                tokens: Some("vits-piper-en_GB-jenny_dioco-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-en_GB-jenny_dioco-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-en_GB-jenny_dioco-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-en_GB-jenny_dioco-medium/en_GB-jenny_dioco-medium.onnx',
        tokens: 'vits-piper-en_GB-jenny_dioco-medium/tokens.txt',
        dataDir: 'vits-piper-en_GB-jenny_dioco-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.
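
The real-time factor (RTF) printed by the example above is the processing time divided by the duration of the generated audio, where the duration is the number of samples divided by the sample rate. A minimal Python sketch of the same arithmetic follows; the function name is made up for this illustration.

```python
def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """Return elapsed time divided by the duration of the generated audio."""
    duration = num_samples / sample_rate  # audio length in seconds
    return elapsed_seconds / duration
```

For example, 44100 samples at 22050 Hz is 2 seconds of audio, so generating it in 0.5 seconds gives an RTF of 0.25. An RTF below 1 means synthesis ran faster than playback.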

Dart API

You can use the following code to play with vits-piper-en_GB-jenny_dioco-medium with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-en_GB-jenny_dioco-medium/en_GB-jenny_dioco-medium.onnx',
    tokens: 'vits-piper-en_GB-jenny_dioco-medium/tokens.txt',
    dataDir: 'vits-piper-en_GB-jenny_dioco-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-en_GB-jenny_dioco-medium with Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-en_GB-jenny_dioco-medium/en_GB-jenny_dioco-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-en_GB-jenny_dioco-medium/tokens.txt",
    dataDir: "vits-piper-en_GB-jenny_dioco-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-en_GB-jenny_dioco-medium with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-en_GB-jenny_dioco-medium/en_GB-jenny_dioco-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-en_GB-jenny_dioco-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-en_GB-jenny_dioco-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-en_GB-jenny_dioco-medium with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-en_GB-jenny_dioco-medium/en_GB-jenny_dioco-medium.onnx",
        tokens = "vits-piper-en_GB-jenny_dioco-medium/tokens.txt",
        dataDir = "vits-piper-en_GB-jenny_dioco-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.
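
The callback's return value decides whether synthesis continues: 1 to continue, 0 to stop. The Python sketch below only simulates that contract with a fake chunk source. fake_generate, stop_after, and the chunking are assumptions made for illustration and do not reflect how sherpa-onnx produces audio internally.

```python
def fake_generate(chunks, callback):
    """Feed chunks to callback one by one; stop as soon as it returns 0."""
    collected = []
    for chunk in chunks:
        collected.extend(chunk)
        if callback(chunk) == 0:  # 0 means stop, 1 means continue
            break
    return collected


def stop_after(max_samples):
    """Build a callback that asks to stop once max_samples have been seen."""
    seen = 0

    def callback(samples):
        nonlocal seen
        seen += len(samples)
        return 1 if seen < max_samples else 0

    return callback
```

A callback like stop_after(n) is one way to cut generation short, for instance when the user cancels playback.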

Java API

You can use the following code to play with vits-piper-en_GB-jenny_dioco-medium with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-en_GB-jenny_dioco-medium/en_GB-jenny_dioco-medium.onnx");
    vits.setTokens("vits-piper-en_GB-jenny_dioco-medium/tokens.txt");
    vits.setDataDir("vits-piper-en_GB-jenny_dioco-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-en_GB-jenny_dioco-medium with Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-en_GB-jenny_dioco-medium/en_GB-jenny_dioco-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-en_GB-jenny_dioco-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-en_GB-jenny_dioco-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-en_GB-jenny_dioco-medium with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-en_GB-jenny_dioco-medium/en_GB-jenny_dioco-medium.onnx",
				Tokens:  "vits-piper-en_GB-jenny_dioco-medium/tokens.txt",
				DataDir: "vits-piper-en_GB-jenny_dioco-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.

sample audios for different speakers are listed below:

Speaker 0

vits-piper-en_GB-miro-high

Info about this model

This model is converted from https://huggingface.co/OpenVoiceOS/pipertts_en-GB_miro

Number of speakers    Sample rate
1                     22050

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_GB-miro-high.tar.bz2

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI            URL         China mirror
arm64-v8a      Download    Download
armeabi-v7a    Download    Download
x86_64         Download    Download
x86            Download    Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-en_GB-miro-high with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-en_GB-miro-high/en_GB-miro-high.onnx";
  config.model.vits.tokens = "vits-piper-en_GB-miro-high/tokens.txt";
  config.model.vits.data_dir = "vits-piper-en_GB-miro-high/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc   -I /tmp/sherpa-onnx/shared/include   -o /tmp/test-piper   /tmp/test-piper.c   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_GB-miro-high.tar.bz2

You can use the following code to play with vits-piper-en_GB-miro-high

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-en_GB-miro-high/en_GB-miro-high.onnx",
            data_dir="vits-piper-en_GB-miro-high/espeak-ng-data",
            tokens="vits-piper-en_GB-miro-high/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
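
Before building the config, it can help to confirm that the three paths used above actually exist. The following is a small sketch; missing_model_files is a hypothetical helper, and the file layout is assumed to match the paths in the example.

```python
from pathlib import Path


def missing_model_files(model_dir, onnx_name):
    """Return the expected paths under model_dir that do not exist."""
    root = Path(model_dir)
    expected = [root / onnx_name, root / "tokens.txt", root / "espeak-ng-data"]
    return [str(p) for p in expected if not p.exists()]
```

For this model the call would be missing_model_files("vits-piper-en_GB-miro-high", "en_GB-miro-high.onnx"). An empty list means all three paths were found.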

C++ API

You can use the following code to play with vits-piper-en_GB-miro-high with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-en_GB-miro-high/en_GB-miro-high.onnx";
  config.model.vits.tokens = "vits-piper-en_GB-miro-high/tokens.txt";
  config.model.vits.data_dir = "vits-piper-en_GB-miro-high/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++   -std=c++17   -I /tmp/sherpa-onnx/shared/include   -o /tmp/test-piper   /tmp/test-piper.cc   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-cxx-api   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with vits-piper-en_GB-miro-high with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-en_GB-miro-high/en_GB-miro-high.onnx".into()),
                tokens: Some("vits-piper-en_GB-miro-high/tokens.txt".into()),
                data_dir: Some("vits-piper-en_GB-miro-high/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-en_GB-miro-high with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-en_GB-miro-high/en_GB-miro-high.onnx',
        tokens: 'vits-piper-en_GB-miro-high/tokens.txt',
        dataDir: 'vits-piper-en_GB-miro-high/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.
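
The config above sets maxNumSentences to 1. This section does not define the option. Assuming it caps how many sentences are synthesized in one batch, the grouping could be sketched as follows. The sentence splitter is a naive regular expression chosen for illustration, not the one sherpa-onnx uses, and treating a non-positive limit as "one batch" is likewise an assumption.

```python
import re


def split_sentences(text):
    """Naively split on whitespace that follows '.', '!' or '?'."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def batch_sentences(sentences, max_num_sentences):
    """Group sentences into batches of at most max_num_sentences.

    Assumption: a non-positive limit means everything goes into one batch.
    """
    if max_num_sentences <= 0:
        return [sentences]
    return [
        sentences[i : i + max_num_sentences]
        for i in range(0, len(sentences), max_num_sentences)
    ]
```

Under these assumptions, the two-sentence sample text with a limit of 1 would be processed as two batches of one sentence each.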

Dart API

You can use the following code to play with vits-piper-en_GB-miro-high with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-en_GB-miro-high/en_GB-miro-high.onnx',
    tokens: 'vits-piper-en_GB-miro-high/tokens.txt',
    dataDir: 'vits-piper-en_GB-miro-high/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-en_GB-miro-high with Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-en_GB-miro-high/en_GB-miro-high.onnx",
    lexicon: "",
    tokens: "vits-piper-en_GB-miro-high/tokens.txt",
    dataDir: "vits-piper-en_GB-miro-high/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-en_GB-miro-high with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-en_GB-miro-high/en_GB-miro-high.onnx";
config.Model.Vits.Tokens = "vits-piper-en_GB-miro-high/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-en_GB-miro-high/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-en_GB-miro-high with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-en_GB-miro-high/en_GB-miro-high.onnx",
        tokens = "vits-piper-en_GB-miro-high/tokens.txt",
        dataDir = "vits-piper-en_GB-miro-high/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with vits-piper-en_GB-miro-high with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-en_GB-miro-high/en_GB-miro-high.onnx");
    vits.setTokens("vits-piper-en_GB-miro-high/tokens.txt");
    vits.setDataDir("vits-piper-en_GB-miro-high/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-en_GB-miro-high with Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-en_GB-miro-high/en_GB-miro-high.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-en_GB-miro-high/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-en_GB-miro-high/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-en_GB-miro-high with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-en_GB-miro-high/en_GB-miro-high.onnx",
				Tokens:  "vits-piper-en_GB-miro-high/tokens.txt",
				DataDir: "vits-piper-en_GB-miro-high/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.

sample audios for different speakers are listed below:

Speaker 0

vits-piper-en_GB-northern_english_male-medium

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/en/en_GB/northern_english_male/medium

Number of speakers    Sample rate
1                     22050

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_GB-northern_english_male-medium.tar.bz2
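
As a rough sketch of the step after downloading, the following Python snippet unpacks the archive and checks that the three paths used by the API examples in this section are present. The helper names (`extract_model`, `missing_files`) are made up for illustration and are not part of sherpa-onnx. The expected file list is inferred from the paths in the example code, so the archive may well contain additional files.

```python
import tarfile
from pathlib import Path

# Directory name taken from the archive name above.
MODEL_DIR = "vits-piper-en_GB-northern_english_male-medium"

# Entries referenced by the API examples in this section (an inferred list,
# not an official manifest).
EXPECTED = [
    "en_GB-northern_english_male-medium.onnx",
    "tokens.txt",
    "espeak-ng-data",
]


def extract_model(archive: str, dest: str = ".") -> Path:
    """Extract a .tar.bz2 model archive into dest and return the model directory."""
    with tarfile.open(archive, "r:bz2") as tar:
        tar.extractall(dest)
    return Path(dest) / MODEL_DIR


def missing_files(model_dir: Path) -> list:
    """Return the expected entries that are absent from model_dir."""
    return [name for name in EXPECTED if not (model_dir / name).exists()]


# Example usage, assuming the archive is in the current directory:
#   model_dir = extract_model(MODEL_DIR + ".tar.bz2")
#   print("Missing:", missing_files(model_dir) or "nothing")
```

An empty list from `missing_files` suggests that the relative paths used in the examples should resolve when you run them from the directory containing the extracted model.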

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI            URL         China mirror
arm64-v8a      Download    Download
armeabi-v7a    Download    Download
x86_64         Download    Download
x86            Download    Download

If you don’t know what ABI is, you probably need to select arm64-v8a.
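
If you are unsure which ABI a device uses, one option is to look at the machine name reported by `uname -m` on the device (for example via `adb shell uname -m`). The mapping below is an illustrative assumption covering common values, not something shipped with sherpa-onnx, and `android_abi` is a hypothetical helper. Unknown values fall back to arm64-v8a, following the advice above.

```python
# Common `uname -m` machine names and the Android ABI they usually imply.
# This table is an assumption for illustration; it is not exhaustive.
MACHINE_TO_ABI = {
    "aarch64": "arm64-v8a",
    "arm64": "arm64-v8a",
    "armv7l": "armeabi-v7a",
    "x86_64": "x86_64",
    "i686": "x86",
    "i386": "x86",
}


def android_abi(machine: str, default: str = "arm64-v8a") -> str:
    """Map a machine name to an ABI column in the APK table."""
    return MACHINE_TO_ABI.get(machine.strip().lower(), default)


# Example usage with the text printed by `adb shell uname -m`:
#   android_abi("aarch64\n")
```

Alternatively, `adb shell getprop ro.product.cpu.abi` prints the ABI name directly, which avoids the mapping altogether.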

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-en_GB-northern_english_male-medium with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-en_GB-northern_english_male-medium/en_GB-northern_english_male-medium.onnx";
  config.model.vits.tokens = "vits-piper-en_GB-northern_english_male-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-en_GB-northern_english_male-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc   -I /tmp/sherpa-onnx/shared/include   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-c-api   -lonnxruntime   -o /tmp/test-piper   /tmp/test-piper.c

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.
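
If you launch the compiled binary from a script rather than from a shell, the same library path can be set for the child process only. The following is a minimal sketch under that assumption. `env_with_library_path` is a hypothetical helper, and the library directory is the install prefix from the cmake command above.

```python
import os
import sys

# Install prefix used by the cmake command above, plus /lib.
LIB_DIR = "/tmp/sherpa-onnx/shared/lib"


def env_with_library_path(lib_dir: str = LIB_DIR,
                          platform: str = sys.platform,
                          base=None) -> dict:
    """Return a copy of the environment with lib_dir prepended to the
    dynamic loader search path (DYLD_LIBRARY_PATH on macOS, otherwise
    LD_LIBRARY_PATH)."""
    env = dict(os.environ if base is None else base)
    var = "DYLD_LIBRARY_PATH" if platform == "darwin" else "LD_LIBRARY_PATH"
    old = env.get(var, "")
    # Prepend so that the freshly built libraries are found first.
    env[var] = lib_dir + (os.pathsep + old if old else "")
    return env


# Example usage, assuming the model was extracted to /tmp:
#   import subprocess
#   subprocess.run(["./test-piper"], cwd="/tmp",
#                  env=env_with_library_path(), check=True)
```

Passing the environment explicitly keeps the change local to that one process, which is why the sketch copies `os.environ` instead of modifying it.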

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_GB-northern_english_male-medium.tar.bz2

You can use the following code to play with vits-piper-en_GB-northern_english_male-medium

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-en_GB-northern_english_male-medium/en_GB-northern_english_male-medium.onnx",
            data_dir="vits-piper-en_GB-northern_english_male-medium/espeak-ng-data",
            tokens="vits-piper-en_GB-northern_english_male-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
                     sid=0,
                     speed=1.0)

sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
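
If you prefer not to depend on soundfile, a minimal sketch using only Python's standard `wave` module is shown below. It assumes that `audio.samples` is a sequence of floats in the range [-1, 1] and that `audio.sample_rate` is an integer. `write_wav` is a hypothetical helper name, not a sherpa-onnx function.

```python
import struct
import wave


def write_wav(filename: str, samples, sample_rate: int) -> None:
    """Write mono float samples in [-1, 1] as a 16-bit PCM WAV file."""
    pcm = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))  # clip so the int16 conversion cannot overflow
        pcm += struct.pack("<h", int(s * 32767))  # little-endian signed 16-bit
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)            # mono
        f.setsampwidth(2)            # 2 bytes per sample = 16 bits
        f.setframerate(sample_rate)
        f.writeframes(bytes(pcm))


# Example usage with the objects from the example above:
#   write_wav("test.wav", audio.samples, audio.sample_rate)
```

The per-sample loop keeps the sketch dependency-free. For long outputs, converting with NumPy before writing would likely be faster.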

C++ API

You can use the following code to play with vits-piper-en_GB-northern_english_male-medium with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-en_GB-northern_english_male-medium/en_GB-northern_english_male-medium.onnx";
  config.model.vits.tokens = "vits-piper-en_GB-northern_english_male-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-en_GB-northern_english_male-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++   -std=c++17   -I /tmp/sherpa-onnx/shared/include   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-cxx-api   -lsherpa-onnx-c-api   -lonnxruntime   -o /tmp/test-piper   /tmp/test-piper.cc

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with vits-piper-en_GB-northern_english_male-medium with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-en_GB-northern_english_male-medium/en_GB-northern_english_male-medium.onnx".into()),
                tokens: Some("vits-piper-en_GB-northern_english_male-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-en_GB-northern_english_male-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-en_GB-northern_english_male-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-en_GB-northern_english_male-medium/en_GB-northern_english_male-medium.onnx',
        tokens: 'vits-piper-en_GB-northern_english_male-medium/tokens.txt',
        dataDir: 'vits-piper-en_GB-northern_english_male-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to play with vits-piper-en_GB-northern_english_male-medium with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-en_GB-northern_english_male-medium/en_GB-northern_english_male-medium.onnx',
    tokens: 'vits-piper-en_GB-northern_english_male-medium/tokens.txt',
    dataDir: 'vits-piper-en_GB-northern_english_male-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-en_GB-northern_english_male-medium with Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-en_GB-northern_english_male-medium/en_GB-northern_english_male-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-en_GB-northern_english_male-medium/tokens.txt",
    dataDir: "vits-piper-en_GB-northern_english_male-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-en_GB-northern_english_male-medium with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-en_GB-northern_english_male-medium/en_GB-northern_english_male-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-en_GB-northern_english_male-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-en_GB-northern_english_male-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-en_GB-northern_english_male-medium with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-en_GB-northern_english_male-medium/en_GB-northern_english_male-medium.onnx",
        tokens = "vits-piper-en_GB-northern_english_male-medium/tokens.txt",
        dataDir = "vits-piper-en_GB-northern_english_male-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with vits-piper-en_GB-northern_english_male-medium with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-en_GB-northern_english_male-medium/en_GB-northern_english_male-medium.onnx");
    vits.setTokens("vits-piper-en_GB-northern_english_male-medium/tokens.txt");
    vits.setDataDir("vits-piper-en_GB-northern_english_male-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-en_GB-northern_english_male-medium with Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-en_GB-northern_english_male-medium/en_GB-northern_english_male-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-en_GB-northern_english_male-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-en_GB-northern_english_male-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-en_GB-northern_english_male-medium with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-en_GB-northern_english_male-medium/en_GB-northern_english_male-medium.onnx",
				Tokens:  "vits-piper-en_GB-northern_english_male-medium/tokens.txt",
				DataDir: "vits-piper-en_GB-northern_english_male-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.

sample audios for different speakers are listed below:

Speaker 0

vits-piper-en_GB-semaine-medium

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/en/en_GB/semaine/medium

Number of speakers    Sample rate
4                     22050

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_GB-semaine-medium.tar.bz2

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI            URL         China mirror
arm64-v8a      Download    Download
armeabi-v7a    Download    Download
x86_64         Download    Download
x86            Download    Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-en_GB-semaine-medium with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-en_GB-semaine-medium/en_GB-semaine-medium.onnx";
  config.model.vits.tokens = "vits-piper-en_GB-semaine-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-en_GB-semaine-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc   -I /tmp/sherpa-onnx/shared/include   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-c-api   -lonnxruntime   -o /tmp/test-piper   /tmp/test-piper.c

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_GB-semaine-medium.tar.bz2

You can use the following code to play with vits-piper-en_GB-semaine-medium

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-en_GB-semaine-medium/en_GB-semaine-medium.onnx",
            data_dir="vits-piper-en_GB-semaine-medium/espeak-ng-data",
            tokens="vits-piper-en_GB-semaine-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
                     sid=0,
                     speed=1.0)

sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
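
The example above uses speaker ID 0 only. Since this model provides four speakers (IDs 0 to 3), a small loop can generate the same text once per speaker. The sketch below reuses only the `tts.generate(text=..., sid=..., speed=...)` call from the example. `generate_for_all_speakers` is a hypothetical helper, and its default of 4 is taken from the model info for this model rather than queried from the library.

```python
def generate_for_all_speakers(tts, text: str, num_speakers: int = 4,
                              speed: float = 1.0) -> dict:
    """Generate `text` once per speaker ID and return {sid: audio}.

    `tts` is expected to be the OfflineTts object created in the example
    above; only its generate() method is used here.
    """
    return {
        sid: tts.generate(text=text, sid=sid, speed=speed)
        for sid in range(num_speakers)
    }


# Example usage, continuing from the example above:
#   audios = generate_for_all_speakers(tts, "Hello from each speaker.")
#   for sid, audio in audios.items():
#       sf.write(f"speaker-{sid}.wav", audio.samples,
#                samplerate=audio.sample_rate)
```

Creating the `OfflineTts` object once and reusing it for every speaker avoids reloading the model for each call.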

C++ API

You can use the following code to play with vits-piper-en_GB-semaine-medium with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-en_GB-semaine-medium/en_GB-semaine-medium.onnx";
  config.model.vits.tokens = "vits-piper-en_GB-semaine-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-en_GB-semaine-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
  -DSHERPA_ONNX_ENABLE_C_API=ON \
  -DCMAKE_BUILD_TYPE=Release \
  -DBUILD_SHARED_LIBS=ON \
  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++ \
  -std=c++17 \
  -I /tmp/sherpa-onnx/shared/include \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-cxx-api \
  -lsherpa-onnx-c-api \
  -lonnxruntime \
  -o /tmp/test-piper \
  /tmp/test-piper.cc

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.
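If you prefer not to change the library path for your whole shell session, you can set it only for the process that runs the binary. The following is a minimal sketch; env_with_library_path and run_with_library_path are hypothetical helper names, and the paths are the ones assumed in the build steps above.

```python
import os
import subprocess
import sys


def env_with_library_path(lib_dir, base_env=None, platform=sys.platform):
    """Return a copy of the environment with lib_dir prepended to the
    dynamic-library search path variable used on the given platform."""
    env = dict(os.environ if base_env is None else base_env)
    # macOS uses DYLD_LIBRARY_PATH; Linux uses LD_LIBRARY_PATH.
    var = "DYLD_LIBRARY_PATH" if platform == "darwin" else "LD_LIBRARY_PATH"
    old = env.get(var, "")
    env[var] = lib_dir + (os.pathsep + old if old else "")
    return env


def run_with_library_path(cmd, lib_dir, cwd=None):
    """Run cmd with lib_dir added to the library search path of that process only."""
    return subprocess.run(cmd, cwd=cwd, env=env_with_library_path(lib_dir), check=True)
```

For example, run_with_library_path(["/tmp/test-piper"], "/tmp/sherpa-onnx/shared/lib", cwd="/tmp") would run the example from /tmp, where the model directory is assumed to have been extracted.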

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with vits-piper-en_GB-semaine-medium with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-en_GB-semaine-medium/en_GB-semaine-medium.onnx".into()),
                tokens: Some("vits-piper-en_GB-semaine-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-en_GB-semaine-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-en_GB-semaine-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-en_GB-semaine-medium/en_GB-semaine-medium.onnx',
        tokens: 'vits-piper-en_GB-semaine-medium/tokens.txt',
        dataDir: 'vits-piper-en_GB-semaine-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to play with vits-piper-en_GB-semaine-medium with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-en_GB-semaine-medium/en_GB-semaine-medium.onnx',
    tokens: 'vits-piper-en_GB-semaine-medium/tokens.txt',
    dataDir: 'vits-piper-en_GB-semaine-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-en_GB-semaine-medium with Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-en_GB-semaine-medium/en_GB-semaine-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-en_GB-semaine-medium/tokens.txt",
    dataDir: "vits-piper-en_GB-semaine-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-en_GB-semaine-medium with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-en_GB-semaine-medium/en_GB-semaine-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-en_GB-semaine-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-en_GB-semaine-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-en_GB-semaine-medium with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  val config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-en_GB-semaine-medium/en_GB-semaine-medium.onnx",
        tokens = "vits-piper-en_GB-semaine-medium/tokens.txt",
        dataDir = "vits-piper-en_GB-semaine-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with vits-piper-en_GB-semaine-medium with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-en_GB-semaine-medium/en_GB-semaine-medium.onnx");
    vits.setTokens("vits-piper-en_GB-semaine-medium/tokens.txt");
    vits.setDataDir("vits-piper-en_GB-semaine-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-en_GB-semaine-medium with Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-en_GB-semaine-medium/en_GB-semaine-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-en_GB-semaine-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-en_GB-semaine-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-en_GB-semaine-medium with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-en_GB-semaine-medium/en_GB-semaine-medium.onnx",
				Tokens:  "vits-piper-en_GB-semaine-medium/tokens.txt",
				DataDir: "vits-piper-en_GB-semaine-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.

sample audio for each speaker is listed below:

Speaker 0

Speaker 1

Speaker 2

Speaker 3
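To produce one file per speaker yourself, you can loop over the speaker IDs and pass each one as sid. The following is a minimal sketch that assumes the tts object from the Python API example and the four speaker IDs (0 to 3) listed above; generate_for_speakers and the output file names are hypothetical.

```python
def generate_for_speakers(tts, text, speaker_ids, write, speed=1.0):
    """Generate `text` once per speaker ID and pass each result to `write`.

    `tts` is any object with a generate(text=..., sid=..., speed=...) method,
    as in the Python API example. `write(path, samples, sample_rate)` is a
    caller-supplied function, e.g. a thin wrapper around soundfile.write.
    Returns the list of output paths.
    """
    paths = []
    for sid in speaker_ids:
        audio = tts.generate(text=text, sid=sid, speed=speed)
        path = f"speaker-{sid}.wav"
        write(path, audio.samples, audio.sample_rate)
        paths.append(path)
    return paths
```

For example, generate_for_speakers(tts, text, range(4), lambda p, s, sr: sf.write(p, s, samplerate=sr)) would write speaker-0.wav through speaker-3.wav.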

vits-piper-en_GB-southern_english_female-low

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/en/en_GB/southern_english_female/low

Number of speakers    Sample rate
1                     16000

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_GB-southern_english_female-low.tar.bz2
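If you want to fetch and unpack the archive from a script, the following is a minimal sketch using only the Python standard library. The helper names download and extract are hypothetical; the URL is the one given above.

```python
import tarfile
import urllib.request
from pathlib import Path

MODEL_URL = (
    "https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/"
    "vits-piper-en_GB-southern_english_female-low.tar.bz2"
)


def download(url, dest_dir="."):
    """Download url into dest_dir and return the path of the local archive."""
    archive = Path(dest_dir) / url.rsplit("/", 1)[-1]
    urllib.request.urlretrieve(url, archive)
    return archive


def extract(archive, dest_dir="."):
    """Extract a .tar.bz2 archive into dest_dir and return its top-level names."""
    with tarfile.open(archive, "r:bz2") as tar:
        names = sorted({m.name.split("/", 1)[0] for m in tar.getmembers()})
        tar.extractall(dest_dir)
    return names
```

Calling extract(download(MODEL_URL)) should leave a vits-piper-en_GB-southern_english_female-low directory in the current directory, which is where the examples below expect it.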

Android APK

The following table lists the Android TTS Engine APKs that include this model, built for sherpa-onnx v1.13.2.

ABI            URL         China mirror
arm64-v8a      Download    Download
armeabi-v7a    Download    Download
x86_64         Download    Download
x86            Download    Download

If you are not sure which ABI your device uses, you probably need to select arm64-v8a.
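If you can read the CPU architecture string of the device (for example, the output of uname -m in a shell on the device), the following sketch maps it to one of the ABIs in the table. The mapping covers only common values and is an assumption for illustration; abi_for_machine is a hypothetical helper, and unknown values fall back to arm64-v8a as suggested above.

```python
# Common machine strings and the Android ABI they usually correspond to.
MACHINE_TO_ABI = {
    "aarch64": "arm64-v8a",
    "arm64": "arm64-v8a",
    "armv7l": "armeabi-v7a",
    "x86_64": "x86_64",
    "i686": "x86",
    "i386": "x86",
}


def abi_for_machine(machine, default="arm64-v8a"):
    """Return the ABI name for a machine string, or `default` if it is unknown."""
    return MACHINE_TO_ABI.get(machine.strip().lower(), default)
```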

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-en_GB-southern_english_female-low with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-en_GB-southern_english_female-low/en_GB-southern_english_female-low.onnx";
  config.model.vits.tokens = "vits-piper-en_GB-southern_english_female-low/tokens.txt";
  config.model.vits.data_dir = "vits-piper-en_GB-southern_english_female-low/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

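  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
  if (!tts) {
    // Creation fails if, e.g., a model file path is wrong
    fprintf(stderr, "Failed to create the TTS object. Please check your config\n");
    return -1;
  }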

  const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
  -DSHERPA_ONNX_ENABLE_C_API=ON \
  -DCMAKE_BUILD_TYPE=Release \
  -DBUILD_SHARED_LIBS=ON \
  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc \
  -I /tmp/sherpa-onnx/shared/include \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-c-api \
  -lonnxruntime \
  -o /tmp/test-piper \
  /tmp/test-piper.c

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.
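After the program finishes, one way to sanity-check the generated ./test.wav is to read back its header. The following is a minimal sketch using the Python standard library, assuming the file is an uncompressed PCM WAV; wav_info is a hypothetical helper name.

```python
import wave


def wav_info(path):
    """Return (sample_rate, num_channels, duration_seconds) of a PCM WAV file."""
    with wave.open(path, "rb") as f:
        rate = f.getframerate()
        return rate, f.getnchannels(), f.getnframes() / rate
```

For this model, the sample rate reported by wav_info("/tmp/test.wav") should match the value given in the model info above (16000).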

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_GB-southern_english_female-low.tar.bz2

You can use the following code to play with vits-piper-en_GB-southern_english_female-low

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-en_GB-southern_english_female-low/en_GB-southern_english_female-low.onnx",
            data_dir="vits-piper-en_GB-southern_english_female-low/espeak-ng-data",
            tokens="vits-piper-en_GB-southern_english_female-low/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
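The examples for the other languages pass a callback that returns 1 to continue generating and 0 to stop. The following is a minimal sketch of such a callback in Python. It assumes that tts.generate accepts a callback keyword argument whose function receives (samples, progress), mirroring the C++ ProgressCallback; please check the Python API of your installed version. make_progress_callback is a hypothetical helper name.

```python
def make_progress_callback(max_samples=None, log=print):
    """Build a callback that logs progress and optionally stops early.

    The returned function follows the convention used by the other examples:
    return 1 to continue generating, 0 to stop.
    """
    state = {"received": 0}

    def callback(samples, progress):
        # Count the samples received so far across all invocations.
        state["received"] += len(samples)
        log(f"Progress: {progress * 100:.1f}%")
        if max_samples is not None and state["received"] >= max_samples:
            return 0  # enough audio received; ask the generator to stop
        return 1

    return callback
```

Under the assumption above, tts.generate(text="...", sid=0, speed=1.0, callback=make_progress_callback()) would print the progress while the audio is being generated.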

C++ API

You can use the following code to play with vits-piper-en_GB-southern_english_female-low with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-en_GB-southern_english_female-low/en_GB-southern_english_female-low.onnx";
  config.model.vits.tokens = "vits-piper-en_GB-southern_english_female-low/tokens.txt";
  config.model.vits.data_dir = "vits-piper-en_GB-southern_english_female-low/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
  -DSHERPA_ONNX_ENABLE_C_API=ON \
  -DCMAKE_BUILD_TYPE=Release \
  -DBUILD_SHARED_LIBS=ON \
  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++ \
  -std=c++17 \
  -I /tmp/sherpa-onnx/shared/include \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-cxx-api \
  -lsherpa-onnx-c-api \
  -lonnxruntime \
  -o /tmp/test-piper \
  /tmp/test-piper.cc

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with vits-piper-en_GB-southern_english_female-low with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-en_GB-southern_english_female-low/en_GB-southern_english_female-low.onnx".into()),
                tokens: Some("vits-piper-en_GB-southern_english_female-low/tokens.txt".into()),
                data_dir: Some("vits-piper-en_GB-southern_english_female-low/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-en_GB-southern_english_female-low with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-en_GB-southern_english_female-low/en_GB-southern_english_female-low.onnx',
        tokens: 'vits-piper-en_GB-southern_english_female-low/tokens.txt',
        dataDir: 'vits-piper-en_GB-southern_english_female-low/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to play with vits-piper-en_GB-southern_english_female-low with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-en_GB-southern_english_female-low/en_GB-southern_english_female-low.onnx',
    tokens: 'vits-piper-en_GB-southern_english_female-low/tokens.txt',
    dataDir: 'vits-piper-en_GB-southern_english_female-low/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-en_GB-southern_english_female-low with Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-en_GB-southern_english_female-low/en_GB-southern_english_female-low.onnx",
    lexicon: "",
    tokens: "vits-piper-en_GB-southern_english_female-low/tokens.txt",
    dataDir: "vits-piper-en_GB-southern_english_female-low/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-en_GB-southern_english_female-low with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-en_GB-southern_english_female-low/en_GB-southern_english_female-low.onnx";
config.Model.Vits.Tokens = "vits-piper-en_GB-southern_english_female-low/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-en_GB-southern_english_female-low/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-en_GB-southern_english_female-low with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  val config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-en_GB-southern_english_female-low/en_GB-southern_english_female-low.onnx",
        tokens = "vits-piper-en_GB-southern_english_female-low/tokens.txt",
        dataDir = "vits-piper-en_GB-southern_english_female-low/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with vits-piper-en_GB-southern_english_female-low with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-en_GB-southern_english_female-low/en_GB-southern_english_female-low.onnx");
    vits.setTokens("vits-piper-en_GB-southern_english_female-low/tokens.txt");
    vits.setDataDir("vits-piper-en_GB-southern_english_female-low/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-en_GB-southern_english_female-low with the Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-en_GB-southern_english_female-low/en_GB-southern_english_female-low.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-en_GB-southern_english_female-low/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-en_GB-southern_english_female-low/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-en_GB-southern_english_female-low with the Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-en_GB-southern_english_female-low/en_GB-southern_english_female-low.onnx",
				Tokens:  "vits-piper-en_GB-southern_english_female-low/tokens.txt",
				DataDir: "vits-piper-en_GB-southern_english_female-low/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.

audio samples for each speaker are listed below:

Speaker 0

vits-piper-en_GB-southern_english_female-medium

Info about this model

This model is converted from https://huggingface.co/csukuangfj/vits-piper-en_GB-southern_english_female-medium

Number of speakers | Sample rate
6 | 22050
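The two numbers in the table above determine which speaker IDs make sense and how long a generated waveform is. The following is a minimal sketch, assuming zero-based speaker IDs (as in the Samples section) and mono audio; `check_sid` and `duration_seconds` are hypothetical helper names, not part of sherpa-onnx.

```python
NUM_SPEAKERS = 6     # from the table above
SAMPLE_RATE = 22050  # Hz, from the table above


def check_sid(sid: int, num_speakers: int = NUM_SPEAKERS) -> int:
    """Return sid unchanged if it is a valid zero-based speaker ID."""
    if not 0 <= sid < num_speakers:
        raise ValueError(f"sid must be in [0, {num_speakers - 1}], got {sid}")
    return sid


def duration_seconds(num_samples: int, sample_rate: int = SAMPLE_RATE) -> float:
    """Length of a mono waveform in seconds."""
    return num_samples / sample_rate


print(check_sid(5))             # the last valid speaker ID
print(duration_seconds(44100))  # 2.0
```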

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_GB-southern_english_female-medium.tar.bz2
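If you prefer to fetch and unpack the archive from a script, the following sketch uses only the Python standard library. The helper names `download` and `extract` are hypothetical; only the URL comes from this page. Extract only archives you trust.

```python
import tarfile
import urllib.request
from pathlib import Path

MODEL_URL = (
    "https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/"
    "vits-piper-en_GB-southern_english_female-medium.tar.bz2"
)


def download(url: str, dest: Path) -> Path:
    """Download url to dest unless the file is already present."""
    if not dest.exists():
        urllib.request.urlretrieve(url, str(dest))
    return dest


def extract(archive: Path, target_dir: Path) -> list:
    """Extract a .tar.bz2 archive into target_dir and return the member names."""
    with tarfile.open(archive, "r:bz2") as tar:
        names = tar.getnames()
        tar.extractall(target_dir)
    return names


# Usage (performs a network download, so it is left commented out):
# archive = download(MODEL_URL, Path("vits-piper-en_GB-southern_english_female-medium.tar.bz2"))
# extract(archive, Path("."))
```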

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI | URL | China mirror
arm64-v8a | Download | Download
armeabi-v7a | Download | Download
x86_64 | Download | Download
x86 | Download | Download

If you don’t know what ABI is, you probably need to select arm64-v8a.
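If you can obtain the machine name of the target device (for example the output of `uname -m` in a shell on the device), a small lookup can suggest the matching ABI. The mapping below is an assumption based on commonly seen machine names and is not taken from sherpa-onnx; `abi_for_machine` is a hypothetical helper.

```python
# Assumed mapping from `uname -m` style machine names to Android ABIs.
MACHINE_TO_ABI = {
    "aarch64": "arm64-v8a",
    "arm64": "arm64-v8a",
    "armv7l": "armeabi-v7a",
    "x86_64": "x86_64",
    "amd64": "x86_64",
    "i686": "x86",
    "i386": "x86",
}


def abi_for_machine(machine: str, default: str = "arm64-v8a") -> str:
    """Return the ABI for a machine name, falling back to arm64-v8a."""
    return MACHINE_TO_ABI.get(machine.strip().lower(), default)


print(abi_for_machine("aarch64"))  # arm64-v8a
```

The fallback mirrors the advice above: when in doubt, arm64-v8a is the most likely choice.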

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-en_GB-southern_english_female-medium with the C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-en_GB-southern_english_female-medium/en_GB-southern_english_female-medium.onnx";
  config.model.vits.tokens = "vits-piper-en_GB-southern_english_female-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-en_GB-southern_english_female-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.
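When the binary is launched from a script rather than an interactive shell, the same library path can be set for just that process. This is a sketch under the assumption that the libraries were installed to /tmp/sherpa-onnx/shared/lib as above; `library_path_env` is a hypothetical helper.

```python
import os
import platform


def library_path_env(lib_dir, system=None, base_env=None):
    """Return a copy of the environment with lib_dir prepended to the
    dynamic-library search path variable for the given platform."""
    system = system or platform.system()
    env = dict(os.environ if base_env is None else base_env)
    var = "DYLD_LIBRARY_PATH" if system == "Darwin" else "LD_LIBRARY_PATH"
    old = env.get(var, "")
    env[var] = f"{lib_dir}:{old}" if old else lib_dir
    return env


# Usage (requires the compiled binary, so it is left commented out):
# import subprocess
# subprocess.run(["/tmp/test-piper"], cwd="/tmp", check=True,
#                env=library_path_env("/tmp/sherpa-onnx/shared/lib"))
```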

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_GB-southern_english_female-medium.tar.bz2

You can use the following code to play with vits-piper-en_GB-southern_english_female-medium

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-en_GB-southern_english_female-medium/en_GB-southern_english_female-medium.onnx",
            data_dir="vits-piper-en_GB-southern_english_female-medium/espeak-ng-data",
            tokens="vits-piper-en_GB-southern_english_female-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
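A mistyped path is an easy mistake to make with the configuration above. The following sketch checks the three paths before the config is built, so that the missing one can be named explicitly; `missing_paths` is a hypothetical helper, not part of sherpa-onnx.

```python
from pathlib import Path


def missing_paths(*paths):
    """Return the subset of paths that do not exist on disk."""
    return [p for p in paths if not Path(p).exists()]


model_dir = "vits-piper-en_GB-southern_english_female-medium"
missing = missing_paths(
    f"{model_dir}/en_GB-southern_english_female-medium.onnx",
    f"{model_dir}/tokens.txt",
    f"{model_dir}/espeak-ng-data",
)
if missing:
    print("Missing:", ", ".join(missing))
```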

C++ API

You can use the following code to play with vits-piper-en_GB-southern_english_female-medium with the C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-en_GB-southern_english_female-medium/en_GB-southern_english_female-medium.onnx";
  config.model.vits.tokens = "vits-piper-en_GB-southern_english_female-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-en_GB-southern_english_female-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with vits-piper-en_GB-southern_english_female-medium with the Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-en_GB-southern_english_female-medium/en_GB-southern_english_female-medium.onnx".into()),
                tokens: Some("vits-piper-en_GB-southern_english_female-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-en_GB-southern_english_female-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-en_GB-southern_english_female-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-en_GB-southern_english_female-medium/en_GB-southern_english_female-medium.onnx',
        tokens: 'vits-piper-en_GB-southern_english_female-medium/tokens.txt',
        dataDir: 'vits-piper-en_GB-southern_english_female-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.
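The real-time factor (RTF) printed by the example above is the synthesis time divided by the duration of the generated audio; a value below 1 means synthesis runs faster than playback. The same arithmetic can be sketched in Python; `real_time_factor` and `timed` are hypothetical helper names.

```python
import time


def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """RTF = time spent generating / duration of the generated audio."""
    duration = num_samples / sample_rate
    if duration <= 0:
        raise ValueError("the generated audio is empty")
    return elapsed_seconds / duration


def timed(fn, *args, **kwargs):
    """Call fn and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


# 44100 samples at 22050 Hz are 2 seconds of audio; generating them in
# 0.5 seconds gives an RTF of 0.25.
print(real_time_factor(0.5, 44100, 22050))
```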

Dart API

You can use the following code to play with vits-piper-en_GB-southern_english_female-medium with the Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-en_GB-southern_english_female-medium/en_GB-southern_english_female-medium.onnx',
    tokens: 'vits-piper-en_GB-southern_english_female-medium/tokens.txt',
    dataDir: 'vits-piper-en_GB-southern_english_female-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-en_GB-southern_english_female-medium with the Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-en_GB-southern_english_female-medium/en_GB-southern_english_female-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-en_GB-southern_english_female-medium/tokens.txt",
    dataDir: "vits-piper-en_GB-southern_english_female-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-en_GB-southern_english_female-medium with the C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-en_GB-southern_english_female-medium/en_GB-southern_english_female-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-en_GB-southern_english_female-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-en_GB-southern_english_female-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-en_GB-southern_english_female-medium with the Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-en_GB-southern_english_female-medium/en_GB-southern_english_female-medium.onnx",
        tokens = "vits-piper-en_GB-southern_english_female-medium/tokens.txt",
        dataDir = "vits-piper-en_GB-southern_english_female-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.
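The callback convention used above (return 1 to continue, 0 to stop) can be used to cut generation short. The sketch below illustrates the idea in Python with a stand-in generator, so it does not depend on sherpa-onnx; `StopAfter` and `fake_generate` are hypothetical names, and the `(samples, progress)` callback signature is an assumption.

```python
class StopAfter:
    """Callback that asks the generator to stop once enough samples arrived."""

    def __init__(self, max_samples):
        self.max_samples = max_samples
        self.received = 0

    def __call__(self, samples, progress=0.0):
        self.received += len(samples)
        # 1 means continue, 0 means stop
        return 1 if self.received < self.max_samples else 0


def fake_generate(chunks, callback):
    """Stand-in for a TTS engine: delivers chunks until the callback returns 0."""
    delivered = 0
    for i, chunk in enumerate(chunks, start=1):
        delivered += 1
        if callback(chunk, i / len(chunks)) == 0:
            break
    return delivered


callback = StopAfter(max_samples=5)
print(fake_generate([[0.0, 0.0]] * 10, callback))  # stops after the third chunk
```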

Java API

You can use the following code to play with vits-piper-en_GB-southern_english_female-medium with the Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-en_GB-southern_english_female-medium/en_GB-southern_english_female-medium.onnx");
    vits.setTokens("vits-piper-en_GB-southern_english_female-medium/tokens.txt");
    vits.setDataDir("vits-piper-en_GB-southern_english_female-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-en_GB-southern_english_female-medium with the Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-en_GB-southern_english_female-medium/en_GB-southern_english_female-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-en_GB-southern_english_female-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-en_GB-southern_english_female-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-en_GB-southern_english_female-medium with the Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-en_GB-southern_english_female-medium/en_GB-southern_english_female-medium.onnx",
				Tokens:  "vits-piper-en_GB-southern_english_female-medium/tokens.txt",
				DataDir: "vits-piper-en_GB-southern_english_female-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.

audio samples for each speaker are listed below:

Speaker 0

Speaker 1

Speaker 2

Speaker 3

Speaker 4

Speaker 5
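To produce one file per speaker yourself, you can loop over the speaker IDs. The sketch below keeps the synthesis step as a function argument so that it runs without a model; `speaker_jobs` and `render_all` are hypothetical helpers, and the output file naming is an assumption.

```python
from typing import Callable, Iterator, List, Tuple


def speaker_jobs(num_speakers: int, stem: str = "speaker") -> Iterator[Tuple[int, str]]:
    """Yield (sid, filename) pairs for zero-based speaker IDs."""
    for sid in range(num_speakers):
        yield sid, f"{stem}-{sid}.wav"


def render_all(synthesize: Callable[[int, str], None], num_speakers: int) -> List[str]:
    """Call synthesize(sid, filename) once per speaker; return the filenames."""
    written = []
    for sid, filename in speaker_jobs(num_speakers):
        synthesize(sid, filename)
        written.append(filename)
    return written


# With the Python API shown earlier, synthesize could look like this (not run here):
# def synthesize(sid, filename):
#     audio = tts.generate(text=text, sid=sid, speed=1.0)
#     sf.write(filename, audio.samples, samplerate=audio.sample_rate)

print(render_all(lambda sid, filename: None, 6))
```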

vits-piper-en_GB-southern_english_male-medium

Info about this model

This model is converted from https://huggingface.co/csukuangfj/vits-piper-en_GB-southern_english_male-medium

Number of speakers | Sample rate
8 | 22050

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_GB-southern_english_male-medium.tar.bz2
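The download URL and the file paths used in the examples for this model follow a regular pattern derived from the model name. The sketch below spells that pattern out; it is inferred from the vits-piper models on this page and may not hold for other models. `piper_model_layout` is a hypothetical helper.

```python
RELEASE_BASE = "https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models"


def piper_model_layout(model_name: str) -> dict:
    """Derive the archive URL and the extracted file paths from a model name."""
    prefix = "vits-piper-"
    if not model_name.startswith(prefix):
        raise ValueError(f"expected a name starting with {prefix!r}")
    voice = model_name[len(prefix):]
    return {
        "url": f"{RELEASE_BASE}/{model_name}.tar.bz2",
        "model": f"{model_name}/{voice}.onnx",
        "tokens": f"{model_name}/tokens.txt",
        "data_dir": f"{model_name}/espeak-ng-data",
    }


layout = piper_model_layout("vits-piper-en_GB-southern_english_male-medium")
for key, value in layout.items():
    print(f"{key}: {value}")
```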

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI | URL | China mirror
arm64-v8a | Download | Download
armeabi-v7a | Download | Download
x86_64 | Download | Download
x86 | Download | Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-en_GB-southern_english_male-medium with the C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-en_GB-southern_english_male-medium/en_GB-southern_english_male-medium.onnx";
  config.model.vits.tokens = "vits-piper-en_GB-southern_english_male-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-en_GB-southern_english_male-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_GB-southern_english_male-medium.tar.bz2

You can use the following code to play with vits-piper-en_GB-southern_english_male-medium

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-en_GB-southern_english_male-medium/en_GB-southern_english_male-medium.onnx",
            data_dir="vits-piper-en_GB-southern_english_male-medium/espeak-ng-data",
            tokens="vits-piper-en_GB-southern_english_male-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
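If you would rather not depend on soundfile, the samples can also be written as 16-bit PCM with the standard library. This is a sketch assuming that the generated samples are mono floats in the range [-1, 1]; `write_wav_int16` is a hypothetical helper.

```python
import struct
import wave


def write_wav_int16(filename, samples, sample_rate):
    """Write mono float samples as a 16-bit PCM WAV file.

    Values outside [-1, 1] are clipped before conversion.
    """
    frames = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))
        frames += struct.pack("<h", int(round(s * 32767)))
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)   # mono
        f.setsampwidth(2)   # 2 bytes = 16 bits per sample
        f.setframerate(sample_rate)
        f.writeframes(bytes(frames))


# Usage with the audio object from the example above (not run here):
# write_wav_int16("test.wav", audio.samples, audio.sample_rate)
```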

C++ API

You can use the following code to play with vits-piper-en_GB-southern_english_male-medium with the C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-en_GB-southern_english_male-medium/en_GB-southern_english_male-medium.onnx";
  config.model.vits.tokens = "vits-piper-en_GB-southern_english_male-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-en_GB-southern_english_male-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with vits-piper-en_GB-southern_english_male-medium with the Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-en_GB-southern_english_male-medium/en_GB-southern_english_male-medium.onnx".into()),
                tokens: Some("vits-piper-en_GB-southern_english_male-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-en_GB-southern_english_male-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-en_GB-southern_english_male-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-en_GB-southern_english_male-medium/en_GB-southern_english_male-medium.onnx',
        tokens: 'vits-piper-en_GB-southern_english_male-medium/tokens.txt',
        dataDir: 'vits-piper-en_GB-southern_english_male-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.
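
The Node.js example above reports a real-time factor (RTF): the elapsed synthesis time divided by the duration of the generated audio. As a minimal sketch of the same arithmetic in Python (the function name real_time_factor is made up for illustration and is not part of sherpa-onnx, and the numbers in the usage lines are example values only):

```python
def real_time_factor(elapsed_seconds: float, num_samples: int, sample_rate: int) -> float:
    """Return elapsed time divided by audio duration.

    Hypothetical helper for illustration; not part of sherpa-onnx.
    """
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # audio length in seconds
    return elapsed_seconds / duration


if __name__ == "__main__":
    # Example values: 2 seconds of audio at 22050 Hz, generated in 0.5 seconds
    rtf = real_time_factor(0.5, 44100, 22050)
    print(f"RTF = {rtf:.3f}")  # prints: RTF = 0.250
```

An RTF below 1 means the audio was generated faster than it takes to play it back.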

Dart API

You can use the following code to play with vits-piper-en_GB-southern_english_male-medium with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-en_GB-southern_english_male-medium/en_GB-southern_english_male-medium.onnx',
    tokens: 'vits-piper-en_GB-southern_english_male-medium/tokens.txt',
    dataDir: 'vits-piper-en_GB-southern_english_male-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-en_GB-southern_english_male-medium with Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-en_GB-southern_english_male-medium/en_GB-southern_english_male-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-en_GB-southern_english_male-medium/tokens.txt",
    dataDir: "vits-piper-en_GB-southern_english_male-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-en_GB-southern_english_male-medium with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-en_GB-southern_english_male-medium/en_GB-southern_english_male-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-en_GB-southern_english_male-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-en_GB-southern_english_male-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-en_GB-southern_english_male-medium with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-en_GB-southern_english_male-medium/en_GB-southern_english_male-medium.onnx",
        tokens = "vits-piper-en_GB-southern_english_male-medium/tokens.txt",
        dataDir = "vits-piper-en_GB-southern_english_male-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with vits-piper-en_GB-southern_english_male-medium with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-en_GB-southern_english_male-medium/en_GB-southern_english_male-medium.onnx");
    vits.setTokens("vits-piper-en_GB-southern_english_male-medium/tokens.txt");
    vits.setDataDir("vits-piper-en_GB-southern_english_male-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-en_GB-southern_english_male-medium with Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-en_GB-southern_english_male-medium/en_GB-southern_english_male-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-en_GB-southern_english_male-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-en_GB-southern_english_male-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-en_GB-southern_english_male-medium with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-en_GB-southern_english_male-medium/en_GB-southern_english_male-medium.onnx",
				Tokens:  "vits-piper-en_GB-southern_english_male-medium/tokens.txt",
				DataDir: "vits-piper-en_GB-southern_english_male-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.

the audio samples for different speakers are listed below:

Speaker 0

Speaker 1

Speaker 2

Speaker 3

Speaker 4

Speaker 5

Speaker 6

Speaker 7

vits-piper-en_GB-vctk-medium

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/en/en_GB/vctk/medium

Number of speakers | Sample rate
109 | 22050

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_GB-vctk-medium.tar.bz2

Android APK

The following table lists the Android TTS Engine APKs built with this model for sherpa-onnx v1.13.2:

ABI | URL | China mirror
arm64-v8a | Download | Download
armeabi-v7a | Download | Download
x86_64 | Download | Download
x86 | Download | Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-en_GB-vctk-medium with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-en_GB-vctk-medium/en_GB-vctk-medium.onnx";
  config.model.vits.tokens = "vits-piper-en_GB-vctk-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-en_GB-vctk-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html
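
The export commands above put the installed shared libraries on the dynamic loader's search path. If you launch the compiled binary from a script instead of a shell, the same idea can be sketched in Python. The helper env_with_library_path below is hypothetical (it is not part of sherpa-onnx); it picks the variable name from the platform in the same way as the two export lines.

```python
import os
import sys
from typing import Dict, Mapping, Optional


def env_with_library_path(
    lib_dir: str,
    base_env: Optional[Mapping[str, str]] = None,
    platform: str = sys.platform,
) -> Dict[str, str]:
    """Return a copy of base_env with lib_dir prepended to the loader path.

    Hypothetical helper for illustration. Uses DYLD_LIBRARY_PATH on macOS
    ("darwin") and LD_LIBRARY_PATH otherwise.
    """
    env = dict(os.environ if base_env is None else base_env)
    var = "DYLD_LIBRARY_PATH" if platform == "darwin" else "LD_LIBRARY_PATH"
    old = env.get(var, "")
    # Both variables use ":" as the separator; skip it when nothing was set.
    env[var] = f"{lib_dir}:{old}" if old else lib_dir
    return env
```

The returned mapping can be passed as the env argument of subprocess.run, for example subprocess.run(["/tmp/test-piper"], cwd="/tmp", env=env_with_library_path("/tmp/sherpa-onnx/shared/lib")). Unlike the plain export line, the helper does not leave a trailing colon when the variable was previously unset.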

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_GB-vctk-medium.tar.bz2

You can use the following code to play with vits-piper-en_GB-vctk-medium

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-en_GB-vctk-medium/en_GB-vctk-medium.onnx",
            data_dir="vits-piper-en_GB-vctk-medium/espeak-ng-data",
            tokens="vits-piper-en_GB-vctk-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
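
This model has many speakers, and sid selects one of them. The sketch below loops over a few speaker IDs and writes one WAV file per speaker using only Python's standard wave module. It makes several assumptions: synthesize is a hypothetical callable standing in for the tts.generate(...) call above, it is assumed to return mono float samples in [-1, 1] together with a sample rate, and the helper names and the speaker-<sid>.wav naming pattern are illustrative rather than part of sherpa-onnx.

```python
import os
import struct
import wave
from typing import Callable, Iterable, List, Sequence, Tuple


def write_wav_int16(path: str, samples: Sequence[float], sample_rate: int) -> None:
    """Write mono float samples in [-1, 1] as 16-bit PCM."""
    # Clip to [-1, 1] before scaling so out-of-range values do not overflow.
    pcm = [int(max(-1.0, min(1.0, s)) * 32767) for s in samples]
    with wave.open(path, "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)  # 2 bytes per sample = 16-bit
        f.setframerate(sample_rate)
        f.writeframes(struct.pack(f"<{len(pcm)}h", *pcm))


def save_per_speaker(
    synthesize: Callable[[int], Tuple[Sequence[float], int]],
    speaker_ids: Iterable[int],
    out_dir: str,
) -> List[str]:
    """Call synthesize(sid) for each speaker ID and save speaker-<sid>.wav."""
    os.makedirs(out_dir, exist_ok=True)
    paths = []
    for sid in speaker_ids:
        samples, sample_rate = synthesize(sid)
        path = os.path.join(out_dir, f"speaker-{sid}.wav")
        write_wav_int16(path, samples, sample_rate)
        paths.append(path)
    return paths
```

With the tts object from the example above, synthesize could be a small function that calls tts.generate(text=..., sid=sid, speed=1.0) and returns (audio.samples, audio.sample_rate).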

C++ API

You can use the following code to play with vits-piper-en_GB-vctk-medium with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-en_GB-vctk-medium/en_GB-vctk-medium.onnx";
  config.model.vits.tokens = "vits-piper-en_GB-vctk-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-en_GB-vctk-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html
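
The ProgressCallback in the C++ example above returns 1 to continue generating and 0 to stop. That contract can be sketched in Python without sherpa-onnx at all. Both make_budget_callback and fake_generate below are hypothetical names used only to illustrate early stopping; fake_generate is a stand-in for the synthesis loop and is not how the library is implemented.

```python
from typing import Callable, List, Sequence

Callback = Callable[[Sequence[float], float], int]


def make_budget_callback(max_samples: int, collected: List[float]) -> Callback:
    """Build a callback that collects samples and asks to stop once
    at least max_samples have been received."""

    def callback(samples: Sequence[float], progress: float) -> int:
        collected.extend(samples)
        # 1 means continue generating, 0 means stop
        return 1 if len(collected) < max_samples else 0

    return callback


def fake_generate(chunks: Sequence[Sequence[float]], callback: Callback) -> int:
    """Stand-in for a synthesis loop: feed chunks to the callback and stop
    as soon as it returns 0. Returns the number of chunks delivered."""
    delivered = 0
    for i, chunk in enumerate(chunks):
        delivered += 1
        if callback(chunk, (i + 1) / len(chunks)) == 0:
            break
    return delivered


if __name__ == "__main__":
    collected: List[float] = []
    cb = make_budget_callback(max_samples=4, collected=collected)
    n = fake_generate([[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]], cb)
    print(n, len(collected))  # prints: 2 4
```

Here the third chunk is never delivered because the callback returned 0 after the second one, which is the behavior the "return 0 to stop generating" comment describes.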

Rust API

You can use the following code to play with vits-piper-en_GB-vctk-medium with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-en_GB-vctk-medium/en_GB-vctk-medium.onnx".into()),
                tokens: Some("vits-piper-en_GB-vctk-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-en_GB-vctk-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-en_GB-vctk-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-en_GB-vctk-medium/en_GB-vctk-medium.onnx',
        tokens: 'vits-piper-en_GB-vctk-medium/tokens.txt',
        dataDir: 'vits-piper-en_GB-vctk-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to play with vits-piper-en_GB-vctk-medium with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-en_GB-vctk-medium/en_GB-vctk-medium.onnx',
    tokens: 'vits-piper-en_GB-vctk-medium/tokens.txt',
    dataDir: 'vits-piper-en_GB-vctk-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-en_GB-vctk-medium with Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-en_GB-vctk-medium/en_GB-vctk-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-en_GB-vctk-medium/tokens.txt",
    dataDir: "vits-piper-en_GB-vctk-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-en_GB-vctk-medium with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-en_GB-vctk-medium/en_GB-vctk-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-en_GB-vctk-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-en_GB-vctk-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-en_GB-vctk-medium with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-en_GB-vctk-medium/en_GB-vctk-medium.onnx",
        tokens = "vits-piper-en_GB-vctk-medium/tokens.txt",
        dataDir = "vits-piper-en_GB-vctk-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with vits-piper-en_GB-vctk-medium with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-en_GB-vctk-medium/en_GB-vctk-medium.onnx");
    vits.setTokens("vits-piper-en_GB-vctk-medium/tokens.txt");
    vits.setDataDir("vits-piper-en_GB-vctk-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-en_GB-vctk-medium with Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-en_GB-vctk-medium/en_GB-vctk-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-en_GB-vctk-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-en_GB-vctk-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-en_GB-vctk-medium with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-en_GB-vctk-medium/en_GB-vctk-medium.onnx",
				Tokens:  "vits-piper-en_GB-vctk-medium/tokens.txt",
				DataDir: "vits-piper-en_GB-vctk-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.

sample audios for different speakers are listed below:

Speakers 0 to 108 (109 speakers, one audio sample each)

vits-piper-en_US-amy-low

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/en/en_US/amy/low

Number of speakers    Sample rate
1                     16000

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_US-amy-low.tar.bz2
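
The archive above can also be fetched and unpacked from Python. The following is a minimal sketch using only the standard library; the helper names `download` and `extract` are illustrative and not part of sherpa-onnx.

```python
import tarfile
import urllib.request
from pathlib import Path

MODEL_URL = (
    "https://github.com/k2-fsa/sherpa-onnx/releases/download/"
    "tts-models/vits-piper-en_US-amy-low.tar.bz2"
)


def download(url: str, dest: Path) -> Path:
    """Download url to dest, skipping the download if dest already exists."""
    if not dest.exists():
        urllib.request.urlretrieve(url, str(dest))
    return dest


def extract(archive: Path, out_dir: Path) -> None:
    """Extract a .tar.bz2 archive into out_dir."""
    # Recent Python versions provide the "data" filter, which rejects
    # unsafe archive members; older versions do not accept the argument.
    kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
    with tarfile.open(archive, "r:bz2") as tar:
        tar.extractall(out_dir, **kwargs)
```

Calling `extract(download(MODEL_URL, Path("vits-piper-en_US-amy-low.tar.bz2")), Path("."))` should leave a `vits-piper-en_US-amy-low` directory in the current directory, which is the layout the code examples below assume.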

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI            URL         China mirror
arm64-v8a      Download    Download
armeabi-v7a    Download    Download
x86_64         Download    Download
x86            Download    Download

If you don’t know what ABI is, you probably need to select arm64-v8a.
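
If you want to check rather than guess, the ABI usually follows from the CPU architecture. On a connected device, `adb shell getprop ro.product.cpu.abi` reports it directly. The sketch below is only an illustration of the mapping from common `uname -m` style machine names to the ABI names in the table; the dictionary and the function name are assumptions of this example, not part of sherpa-onnx.

```python
# Common machine names (as printed by `uname -m`) and the matching Android ABI.
MACHINE_TO_ABI = {
    "aarch64": "arm64-v8a",
    "arm64": "arm64-v8a",
    "armv7l": "armeabi-v7a",
    "armv8l": "armeabi-v7a",  # 32-bit userspace on a 64-bit ARM CPU
    "x86_64": "x86_64",
    "amd64": "x86_64",
    "i686": "x86",
    "i386": "x86",
}


def abi_for_machine(machine: str, default: str = "arm64-v8a") -> str:
    """Return the Android ABI for a machine name, or default if unknown."""
    return MACHINE_TO_ABI.get(machine.lower(), default)


print(abi_for_machine("aarch64"))
```

The default of `arm64-v8a` mirrors the advice above for readers who are unsure.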

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-en_US-amy-low with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-en_US-amy-low/en_US-amy-low.onnx";
  config.model.vits.tokens = "vits-piper-en_US-amy-low/tokens.txt";
  config.model.vits.data_dir = "vits-piper-en_US-amy-low/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.
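
To see whether the export is needed at all, you can check whether the library directory is already on the loader search path. The following Python sketch does this; the function names are illustrative and the directory is the install prefix used above.

```python
import os
import sys

LIB_DIR = "/tmp/sherpa-onnx/shared/lib"


def library_path_variable(platform_name: str = sys.platform) -> str:
    """Name of the environment variable the dynamic loader consults."""
    return "DYLD_LIBRARY_PATH" if platform_name == "darwin" else "LD_LIBRARY_PATH"


def is_on_library_path(lib_dir, env=os.environ, platform_name=sys.platform) -> bool:
    """True if lib_dir is one of the colon-separated entries of the variable."""
    value = env.get(library_path_variable(platform_name), "")
    return lib_dir in [p for p in value.split(":") if p]


if not is_on_library_path(LIB_DIR):
    print(f"export {library_path_variable()}={LIB_DIR}:${library_path_variable()}")
```

When the directory is missing, the script prints the export command for the current platform.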

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_US-amy-low.tar.bz2

You can use the following code to play with vits-piper-en_US-amy-low

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-en_US-amy-low/en_US-amy-low.onnx",
            data_dir="vits-piper-en_US-amy-low/espeak-ng-data",
            tokens="vits-piper-en_US-amy-low/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
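
If soundfile is not available, the samples can also be saved with Python's built-in wave module. The sketch below assumes that the samples are mono floats in the range [-1, 1], as in the example above; `write_wave_16bit` is a hypothetical helper, not a sherpa-onnx function.

```python
import struct
import wave


def write_wave_16bit(filename, samples, sample_rate):
    """Write mono float samples in [-1, 1] as a 16-bit PCM WAV file."""
    frames = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))  # clip out-of-range values
        frames += struct.pack("<h", int(round(s * 32767)))
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)   # mono
        f.setsampwidth(2)   # 2 bytes = 16 bits per sample
        f.setframerate(sample_rate)
        f.writeframes(bytes(frames))
```

With the objects from the example above, the call would be `write_wave_16bit("test.wav", audio.samples, audio.sample_rate)`.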

C++ API

You can use the following code to play with vits-piper-en_US-amy-low with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-en_US-amy-low/en_US-amy-low.onnx";
  config.model.vits.tokens = "vits-piper-en_US-amy-low/tokens.txt";
  config.model.vits.data_dir = "vits-piper-en_US-amy-low/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html
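
The ProgressCallback in the C++ example returns 1 to continue and 0 to stop. The following Python sketch illustrates that protocol only. The function `generate_in_chunks` is a stand-in written for this illustration and is not how sherpa-onnx generates audio; `make_limited_callback` is likewise a hypothetical helper.

```python
def generate_in_chunks(chunks, callback):
    """Stand-in for a synthesis loop: hand each chunk to callback, stop on 0."""
    produced = []
    total = len(chunks)
    for i, chunk in enumerate(chunks, start=1):
        produced.extend(chunk)
        # The callback sees the new samples and the fraction completed so far.
        if callback(chunk, i / total) == 0:
            break
    return produced


def make_limited_callback(max_samples):
    """Build a callback that asks to stop once max_samples have been seen."""
    seen = 0

    def callback(samples, progress):
        nonlocal seen
        seen += len(samples)
        return 1 if seen < max_samples else 0

    return callback


chunks = [[0.0] * 100 for _ in range(5)]
audio = generate_in_chunks(chunks, make_limited_callback(250))
print(len(audio))
```

A callback of this kind is one way to cut generation short, for example when the user cancels playback.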

Rust API

You can use the following code to play with vits-piper-en_US-amy-low with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-en_US-amy-low/en_US-amy-low.onnx".into()),
                tokens: Some("vits-piper-en_US-amy-low/tokens.txt".into()),
                data_dir: Some("vits-piper-en_US-amy-low/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-en_US-amy-low with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-en_US-amy-low/en_US-amy-low.onnx',
        tokens: 'vits-piper-en_US-amy-low/tokens.txt',
        dataDir: 'vits-piper-en_US-amy-low/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.
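
The real-time factor (RTF) printed by the example above is the elapsed synthesis time divided by the duration of the generated audio, so a value below 1 means synthesis ran faster than playback. The same arithmetic as a small Python sketch, with illustrative function names:

```python
import time


def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """RTF = processing time / audio duration."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # seconds of audio produced
    return elapsed_seconds / duration


def timed(fn, *args, **kwargs):
    """Call fn and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


# 0.5 s of compute for 32000 samples at 16000 Hz (2 s of audio)
print(real_time_factor(0.5, 32000, 16000))  # prints 0.25
```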

Dart API

You can use the following code to play with vits-piper-en_US-amy-low with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-en_US-amy-low/en_US-amy-low.onnx',
    tokens: 'vits-piper-en_US-amy-low/tokens.txt',
    dataDir: 'vits-piper-en_US-amy-low/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-en_US-amy-low with Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-en_US-amy-low/en_US-amy-low.onnx",
    lexicon: "",
    tokens: "vits-piper-en_US-amy-low/tokens.txt",
    dataDir: "vits-piper-en_US-amy-low/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-en_US-amy-low with C# API.

using System;
using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-en_US-amy-low/en_US-amy-low.onnx";
config.Model.Vits.Tokens = "vits-piper-en_US-amy-low/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-en_US-amy-low/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-en_US-amy-low with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-en_US-amy-low/en_US-amy-low.onnx",
        tokens = "vits-piper-en_US-amy-low/tokens.txt",
        dataDir = "vits-piper-en_US-amy-low/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with vits-piper-en_US-amy-low with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-en_US-amy-low/en_US-amy-low.onnx");
    vits.setTokens("vits-piper-en_US-amy-low/tokens.txt");
    vits.setDataDir("vits-piper-en_US-amy-low/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-en_US-amy-low with Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-en_US-amy-low/en_US-amy-low.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-en_US-amy-low/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-en_US-amy-low/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-en_US-amy-low with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-en_US-amy-low/en_US-amy-low.onnx",
				Tokens:  "vits-piper-en_US-amy-low/tokens.txt",
				DataDir: "vits-piper-en_US-amy-low/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.

sample audios for different speakers are listed below:

Speaker 0

vits-piper-en_US-amy-medium

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/en/en_US/amy/medium

Number of speakers    Sample rate
1                     22050

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_US-amy-medium.tar.bz2
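
The download URL and the file paths used in the code examples follow a regular pattern based on the model name. The sketch below spells that pattern out. It is inferred from the piper models shown in this document and may not hold for every model; `piper_model_files` is a hypothetical helper.

```python
from pathlib import PurePosixPath

RELEASE_BASE = "https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models"


def piper_model_files(model_name: str) -> dict:
    """Derive the archive URL and file paths for a vits-piper-* model name."""
    prefix = "vits-piper-"
    if not model_name.startswith(prefix):
        raise ValueError(f"expected a name starting with {prefix!r}")
    voice = model_name[len(prefix):]  # e.g. en_US-amy-medium
    root = PurePosixPath(model_name)
    return {
        "url": f"{RELEASE_BASE}/{model_name}.tar.bz2",
        "model": str(root / f"{voice}.onnx"),
        "tokens": str(root / "tokens.txt"),
        "data_dir": str(root / "espeak-ng-data"),
    }


for key, value in piper_model_files("vits-piper-en_US-amy-medium").items():
    print(f"{key}: {value}")
```

The returned paths are relative to the directory where the archive was extracted.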

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI            URL         China mirror
arm64-v8a      Download    Download
armeabi-v7a    Download    Download
x86_64         Download    Download
x86            Download    Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-en_US-amy-medium with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-en_US-amy-medium/en_US-amy-medium.onnx";
  config.model.vits.tokens = "vits-piper-en_US-amy-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-en_US-amy-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_US-amy-medium.tar.bz2

You can use the following code to play with vits-piper-en_US-amy-medium

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-en_US-amy-medium/en_US-amy-medium.onnx",
            data_dir="vits-piper-en_US-amy-medium/espeak-ng-data",
            tokens="vits-piper-en_US-amy-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)

C++ API

You can use the following code to play with vits-piper-en_US-amy-medium with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-en_US-amy-medium/en_US-amy-medium.onnx";
  config.model.vits.tokens = "vits-piper-en_US-amy-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-en_US-amy-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with vits-piper-en_US-amy-medium with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-en_US-amy-medium/en_US-amy-medium.onnx".into()),
                tokens: Some("vits-piper-en_US-amy-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-en_US-amy-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-en_US-amy-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-en_US-amy-medium/en_US-amy-medium.onnx',
        tokens: 'vits-piper-en_US-amy-medium/tokens.txt',
        dataDir: 'vits-piper-en_US-amy-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.
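
The Node.js example above reports a real-time factor (RTF): the time spent generating divided by the duration of the generated audio. A value below 1 means synthesis runs faster than playback. The following Python sketch shows the same arithmetic in isolation. The function name `real_time_factor` and the numbers in the example call are hypothetical and are not part of sherpa-onnx.

```python
def real_time_factor(elapsed_seconds: float, num_samples: int, sample_rate: int) -> float:
    """Return elapsed synthesis time divided by the duration of the generated audio."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # audio length in seconds
    return elapsed_seconds / duration


# Hypothetical numbers: 0.5 s of compute for 44100 samples at 22050 Hz (2 s of audio).
rtf = real_time_factor(0.5, 44100, 22050)
print(f"RTF = {rtf:.3f}")  # prints "RTF = 0.250"
```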

Dart API

You can use the following code to play with vits-piper-en_US-amy-medium with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-en_US-amy-medium/en_US-amy-medium.onnx',
    tokens: 'vits-piper-en_US-amy-medium/tokens.txt',
    dataDir: 'vits-piper-en_US-amy-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-en_US-amy-medium with Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-en_US-amy-medium/en_US-amy-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-en_US-amy-medium/tokens.txt",
    dataDir: "vits-piper-en_US-amy-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-en_US-amy-medium with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-en_US-amy-medium/en_US-amy-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-en_US-amy-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-en_US-amy-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-en_US-amy-medium with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-en_US-amy-medium/en_US-amy-medium.onnx",
        tokens = "vits-piper-en_US-amy-medium/tokens.txt",
        dataDir = "vits-piper-en_US-amy-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with vits-piper-en_US-amy-medium with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-en_US-amy-medium/en_US-amy-medium.onnx");
    vits.setTokens("vits-piper-en_US-amy-medium/tokens.txt");
    vits.setDataDir("vits-piper-en_US-amy-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-en_US-amy-medium with Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-en_US-amy-medium/en_US-amy-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-en_US-amy-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-en_US-amy-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-en_US-amy-medium with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-en_US-amy-medium/en_US-amy-medium.onnx",
				Tokens:  "vits-piper-en_US-amy-medium/tokens.txt",
				DataDir: "vits-piper-en_US-amy-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.

sample audio for each speaker is listed below:

Speaker 0

vits-piper-en_US-arctic-medium

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/en/en_US/arctic/medium

| Number of speakers | Sample rate |
|---|---|
| 18 | 22050 |

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_US-arctic-medium.tar.bz2
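
If you prefer to fetch and unpack the archive from a script, the following Python sketch does so with the standard library only. The helper names `extract_archive` and `download_and_extract` are hypothetical and are not part of sherpa-onnx. The sketch assumes the archive is a regular bzip2-compressed tarball, as the `.tar.bz2` suffix suggests.

```python
import tarfile
import urllib.request
from pathlib import Path

MODEL_URL = (
    "https://github.com/k2-fsa/sherpa-onnx/releases/download/"
    "tts-models/vits-piper-en_US-arctic-medium.tar.bz2"
)


def extract_archive(archive_path, dest_dir="."):
    """Extract a .tar.bz2 archive into dest_dir and return its top-level names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    top_level = set()
    with tarfile.open(archive_path, mode="r:bz2") as tar:
        for member in tar.getmembers():
            parts = Path(member.name).parts
            if parts:
                top_level.add(parts[0])
        # Use the safer "data" extraction filter when this Python version has it.
        if hasattr(tarfile, "data_filter"):
            tar.extractall(dest, filter="data")
        else:
            tar.extractall(dest)
    return sorted(top_level)


def download_and_extract(url=MODEL_URL, dest_dir="."):
    """Download the archive, extract it, then delete the downloaded file."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    archive_path = dest / url.rsplit("/", 1)[-1]
    urllib.request.urlretrieve(url, archive_path)
    names = extract_archive(archive_path, dest)
    archive_path.unlink()
    return names
```

Calling `download_and_extract()` from the directory where you run the examples should leave a `vits-piper-en_US-arctic-medium` directory there, which is the relative path the code examples for this model use.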

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

| ABI | URL | China mirror |
|---|---|---|
| arm64-v8a | Download | Download |
| armeabi-v7a | Download | Download |
| x86_64 | Download | Download |
| x86 | Download | Download |

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-en_US-arctic-medium with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-en_US-arctic-medium/en_US-arctic-medium.onnx";
  config.model.vits.tokens = "vits-piper-en_US-arctic-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-en_US-arctic-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.c -lsherpa-onnx-c-api -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_US-arctic-medium.tar.bz2

You can use the following code to play with vits-piper-en_US-arctic-medium

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-en_US-arctic-medium/en_US-arctic-medium.onnx",
            data_dir="vits-piper-en_US-arctic-medium/espeak-ng-data",
            tokens="vits-piper-en_US-arctic-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
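
The `config.validate()` check above fails when a model file cannot be found, which usually means the script is not running from the directory that contains the extracted model. A small pre-flight check can tell you which path is missing. The sketch below uses only `pathlib`. The helper name `missing_model_files` is hypothetical and is not part of sherpa-onnx. The three paths it checks are the ones used in the example above.

```python
from pathlib import Path
from typing import List


def missing_model_files(model_dir: str, onnx_name: str) -> List[Path]:
    """Return the required model paths that do not exist under model_dir."""
    root = Path(model_dir)
    required = [
        root / onnx_name,          # acoustic model
        root / "tokens.txt",       # token table
        root / "espeak-ng-data",   # espeak-ng data directory
    ]
    return [path for path in required if not path.exists()]


# Report anything missing relative to the current working directory.
for path in missing_model_files(
    "vits-piper-en_US-arctic-medium", "en_US-arctic-medium.onnx"
):
    print(f"Missing: {path}")
```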

C++ API

You can use the following code to play with vits-piper-en_US-arctic-medium with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-en_US-arctic-medium/en_US-arctic-medium.onnx";
  config.model.vits.tokens = "vits-piper-en_US-arctic-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-en_US-arctic-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.cc -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html
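
The `ProgressCallback` in the C++ example above returns 1 to continue generating and 0 to stop. The following Python sketch illustrates one use of that convention: asking the generator to stop once a chosen amount of audio has been received. It is plain Python and does not call sherpa-onnx. The factory name `make_progress_callback` is hypothetical, and the sketch assumes each invocation receives only newly generated samples rather than the cumulative audio.

```python
from typing import Callable, Sequence


def make_progress_callback(
    max_seconds: float, sample_rate: int
) -> Callable[[Sequence[float], float], int]:
    """Build a callback that returns 0 once max_seconds of audio has arrived."""
    state = {"num_samples": 0}

    def callback(samples: Sequence[float], progress: float) -> int:
        # Assumption: `samples` holds only the newly generated chunk.
        state["num_samples"] += len(samples)
        print(f"Progress: {progress * 100:.1f}%")
        received_seconds = state["num_samples"] / sample_rate
        # 1 means continue generating, 0 means stop.
        return 1 if received_seconds < max_seconds else 0

    return callback
```

Keeping the running sample count in a closure avoids a global variable, so several generations can each use their own callback.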

Rust API

You can use the following code to play with vits-piper-en_US-arctic-medium with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-en_US-arctic-medium/en_US-arctic-medium.onnx".into()),
                tokens: Some("vits-piper-en_US-arctic-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-en_US-arctic-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-en_US-arctic-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-en_US-arctic-medium/en_US-arctic-medium.onnx',
        tokens: 'vits-piper-en_US-arctic-medium/tokens.txt',
        dataDir: 'vits-piper-en_US-arctic-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to play with vits-piper-en_US-arctic-medium with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-en_US-arctic-medium/en_US-arctic-medium.onnx',
    tokens: 'vits-piper-en_US-arctic-medium/tokens.txt',
    dataDir: 'vits-piper-en_US-arctic-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-en_US-arctic-medium with Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-en_US-arctic-medium/en_US-arctic-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-en_US-arctic-medium/tokens.txt",
    dataDir: "vits-piper-en_US-arctic-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-en_US-arctic-medium with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-en_US-arctic-medium/en_US-arctic-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-en_US-arctic-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-en_US-arctic-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-en_US-arctic-medium with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-en_US-arctic-medium/en_US-arctic-medium.onnx",
        tokens = "vits-piper-en_US-arctic-medium/tokens.txt",
        dataDir = "vits-piper-en_US-arctic-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with vits-piper-en_US-arctic-medium with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-en_US-arctic-medium/en_US-arctic-medium.onnx");
    vits.setTokens("vits-piper-en_US-arctic-medium/tokens.txt");
    vits.setDataDir("vits-piper-en_US-arctic-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-en_US-arctic-medium with Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-en_US-arctic-medium/en_US-arctic-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-en_US-arctic-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-en_US-arctic-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-en_US-arctic-medium with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-en_US-arctic-medium/en_US-arctic-medium.onnx",
				Tokens:  "vits-piper-en_US-arctic-medium/tokens.txt",
				DataDir: "vits-piper-en_US-arctic-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.

sample audio for each speaker is listed below:

Speaker 0

Speaker 1

Speaker 2

Speaker 3

Speaker 4

Speaker 5

Speaker 6

Speaker 7

Speaker 8

Speaker 9

Speaker 10

Speaker 11

Speaker 12

Speaker 13

Speaker 14

Speaker 15

Speaker 16

Speaker 17
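
This model has 18 speakers, so valid speaker IDs run from 0 to 17. To produce one file per speaker yourself, you can loop over the IDs and pass each one as `sid`. The Python sketch below only builds the list of speaker IDs and output paths. The helper name `speaker_outputs` and the `speaker-<sid>.wav` naming scheme are hypothetical.

```python
from pathlib import Path
from typing import List, Tuple


def speaker_outputs(num_speakers: int, out_dir: str = ".") -> List[Tuple[int, Path]]:
    """Return (sid, output path) pairs for speaker IDs 0 .. num_speakers - 1."""
    if num_speakers < 1:
        raise ValueError("num_speakers must be at least 1")
    return [
        (sid, Path(out_dir) / f"speaker-{sid}.wav")
        for sid in range(num_speakers)
    ]


# 18 speakers for vits-piper-en_US-arctic-medium (see the table for this model).
for sid, path in speaker_outputs(18, "samples"):
    print(sid, path)
```

Inside the loop, each pair would be passed to the calls shown in the Python API example for this model, that is `tts.generate(text=..., sid=sid, speed=1.0)` followed by `sf.write(str(path), audio.samples, samplerate=audio.sample_rate)`.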

vits-piper-en_US-bryce-medium

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/en/en_US/bryce/medium

| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_US-bryce-medium.tar.bz2

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

| ABI | URL | China mirror |
|---|---|---|
| arm64-v8a | Download | Download |
| armeabi-v7a | Download | Download |
| x86_64 | Download | Download |
| x86 | Download | Download |

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-en_US-bryce-medium with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-en_US-bryce-medium/en_US-bryce-medium.onnx";
  config.model.vits.tokens = "vits-piper-en_US-bryce-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-en_US-bryce-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc   -I /tmp/sherpa-onnx/shared/include   -o /tmp/test-piper   /tmp/test-piper.c   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_US-bryce-medium.tar.bz2

You can use the following code to play with vits-piper-en_US-bryce-medium

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-en_US-bryce-medium/en_US-bryce-medium.onnx",
            data_dir="vits-piper-en_US-bryce-medium/espeak-ng-data",
            tokens="vits-piper-en_US-bryce-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
                     sid=0,
                     speed=1.0)

# Save as WAV, like the other examples in this section
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
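
The C++, Kotlin, and Java examples in this section pass a callback that returns 1 to continue and 0 to stop. The sketch below shows the same idea in Python with a hypothetical `StopAfter` helper that collects the chunks it receives and asks the generator to stop once a sample budget is reached. It assumes the callback is invoked as `callback(samples, progress)` and can be passed via `tts.generate(..., callback=...)`; neither is shown in the example above, so check the Python API documentation before relying on them. The helper itself does not depend on sherpa-onnx.

```python
from typing import List, Sequence


class StopAfter:
    """Hypothetical helper: collect chunks and stop after max_samples."""

    def __init__(self, max_samples: int) -> None:
        self.max_samples = max_samples
        self.chunks: List[Sequence[float]] = []
        self.num_samples = 0

    def __call__(self, samples: Sequence[float], progress: float) -> int:
        # Keep every chunk, including the one that crosses the budget.
        self.chunks.append(samples)
        self.num_samples += len(samples)
        # 1 means continue generating, 0 means stop.
        return 1 if self.num_samples < self.max_samples else 0


if __name__ == "__main__":
    # Simulate chunks of 8000 samples with a budget of 16000 samples.
    cb = StopAfter(max_samples=16000)
    for i in range(3):
        keep_going = cb([0.0] * 8000, (i + 1) / 3)
        print(f"chunk {i}: collected {cb.num_samples} samples, continue={keep_going}")
        if keep_going == 0:
            break
```

Under the assumption stated above, an instance such as `StopAfter(max_samples=16000)` would be passed as the `callback` argument, and `cb.chunks` would then hold the audio produced before generation stopped.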

C++ API

You can use the following code to play with vits-piper-en_US-bryce-medium with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-en_US-bryce-medium/en_US-bryce-medium.onnx";
  config.model.vits.tokens = "vits-piper-en_US-bryce-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-en_US-bryce-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++   -std=c++17   -I /tmp/sherpa-onnx/shared/include   -o /tmp/test-piper   /tmp/test-piper.cc   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-cxx-api   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with vits-piper-en_US-bryce-medium with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-en_US-bryce-medium/en_US-bryce-medium.onnx".into()),
                tokens: Some("vits-piper-en_US-bryce-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-en_US-bryce-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-en_US-bryce-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-en_US-bryce-medium/en_US-bryce-medium.onnx',
        tokens: 'vits-piper-en_US-bryce-medium/tokens.txt',
        dataDir: 'vits-piper-en_US-bryce-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to play with vits-piper-en_US-bryce-medium with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-en_US-bryce-medium/en_US-bryce-medium.onnx',
    tokens: 'vits-piper-en_US-bryce-medium/tokens.txt',
    dataDir: 'vits-piper-en_US-bryce-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-en_US-bryce-medium with Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-en_US-bryce-medium/en_US-bryce-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-en_US-bryce-medium/tokens.txt",
    dataDir: "vits-piper-en_US-bryce-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-en_US-bryce-medium with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-en_US-bryce-medium/en_US-bryce-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-en_US-bryce-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-en_US-bryce-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-en_US-bryce-medium with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-en_US-bryce-medium/en_US-bryce-medium.onnx",
        tokens = "vits-piper-en_US-bryce-medium/tokens.txt",
        dataDir = "vits-piper-en_US-bryce-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with vits-piper-en_US-bryce-medium with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-en_US-bryce-medium/en_US-bryce-medium.onnx");
    vits.setTokens("vits-piper-en_US-bryce-medium/tokens.txt");
    vits.setDataDir("vits-piper-en_US-bryce-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-en_US-bryce-medium with Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-en_US-bryce-medium/en_US-bryce-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-en_US-bryce-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-en_US-bryce-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-en_US-bryce-medium with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-en_US-bryce-medium/en_US-bryce-medium.onnx",
				Tokens:  "vits-piper-en_US-bryce-medium/tokens.txt",
				DataDir: "vits-piper-en_US-bryce-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.

audio samples for each speaker are listed below:

Speaker 0

vits-piper-en_US-danny-low

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/en/en_US/danny/low

Number of speakers    Sample rate
1                     16000

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_US-danny-low.tar.bz2
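
The archive has to be extracted before any of the examples below can load the model. The following is a minimal Python sketch, assuming the archive was saved to the current directory and unpacks into a `vits-piper-en_US-danny-low/` directory containing the three paths that the API examples in this section refer to. The helper names `extract_model` and `missing_model_files` are made up for this illustration.

```python
import tarfile
from pathlib import Path
from typing import List, Sequence

# Paths used by the API examples, relative to the model directory.
EXPECTED = ("en_US-danny-low.onnx", "tokens.txt", "espeak-ng-data")


def extract_model(archive: str, dest: str = ".") -> None:
    """Extract a .tar.bz2 archive into dest (similar to `tar xvf`)."""
    with tarfile.open(archive, "r:bz2") as tar:
        tar.extractall(dest)


def missing_model_files(model_dir: str,
                        expected: Sequence[str] = EXPECTED) -> List[str]:
    """Return the expected entries that do not exist under model_dir."""
    root = Path(model_dir)
    return [name for name in expected if not (root / name).exists()]


if __name__ == "__main__":
    archive = "vits-piper-en_US-danny-low.tar.bz2"
    if Path(archive).is_file():
        extract_model(archive)
    missing = missing_model_files("vits-piper-en_US-danny-low")
    print("Missing:", missing if missing else "nothing")
```

Running the check before creating the TTS object gives a clearer error than a failed model load when the working directory is wrong.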

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI            URL         China mirror
arm64-v8a      Download    Download
armeabi-v7a    Download    Download
x86_64         Download    Download
x86            Download    Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-en_US-danny-low with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-en_US-danny-low/en_US-danny-low.onnx";
  config.model.vits.tokens = "vits-piper-en_US-danny-low/tokens.txt";
  config.model.vits.data_dir = "vits-piper-en_US-danny-low/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

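  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
  if (!tts) {
    // Creation fails, e.g., when the model files cannot be found
    fprintf(stderr, "Failed to create the TTS engine. Please check your config\n");
    return -1;
  }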

  const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc   -I /tmp/sherpa-onnx/shared/include   -o /tmp/test-piper   /tmp/test-piper.c   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_US-danny-low.tar.bz2

You can use the following code to play with vits-piper-en_US-danny-low

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-en_US-danny-low/en_US-danny-low.onnx",
            data_dir="vits-piper-en_US-danny-low/espeak-ng-data",
            tokens="vits-piper-en_US-danny-low/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
                     sid=0,
                     speed=1.0)

# Save as WAV, like the other examples in this section
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
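
To sanity-check the result, it can help to inspect the generated file and compute the real-time factor (RTF), as the Node.js example in this section does. The following sketch uses only the Python standard library and assumes a PCM WAV file such as the `test.wav` written by the C and C++ examples. `wav_info` and `real_time_factor` are hypothetical helper names; for this model, the reported sample rate should match the 16000 listed in the model info.

```python
import os
import wave
from typing import Tuple


def wav_info(path: str) -> Tuple[int, int, float]:
    """Return (sample_rate, num_frames, duration_seconds) of a PCM WAV file."""
    with wave.open(path, "rb") as f:
        sample_rate = f.getframerate()
        num_frames = f.getnframes()
    return sample_rate, num_frames, num_frames / sample_rate


def real_time_factor(elapsed_seconds: float, duration_seconds: float) -> float:
    """RTF = processing time / audio duration; below 1 is faster than real time."""
    return elapsed_seconds / duration_seconds


if __name__ == "__main__":
    if os.path.isfile("test.wav"):
        rate, frames, duration = wav_info("test.wav")
        print(f"sample rate: {rate}, frames: {frames}, duration: {duration:.3f} s")
```

To obtain the elapsed time, wrap the generation call in two `time.perf_counter()` readings and pass their difference to `real_time_factor`.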

C++ API

You can use the following code to play with vits-piper-en_US-danny-low with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-en_US-danny-low/en_US-danny-low.onnx";
  config.model.vits.tokens = "vits-piper-en_US-danny-low/tokens.txt";
  config.model.vits.data_dir = "vits-piper-en_US-danny-low/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++   -std=c++17   -I /tmp/sherpa-onnx/shared/include   -o /tmp/test-piper   /tmp/test-piper.cc   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-cxx-api   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with vits-piper-en_US-danny-low with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-en_US-danny-low/en_US-danny-low.onnx".into()),
                tokens: Some("vits-piper-en_US-danny-low/tokens.txt".into()),
                data_dir: Some("vits-piper-en_US-danny-low/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-en_US-danny-low with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-en_US-danny-low/en_US-danny-low.onnx',
        tokens: 'vits-piper-en_US-danny-low/tokens.txt',
        dataDir: 'vits-piper-en_US-danny-low/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to play with vits-piper-en_US-danny-low with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-en_US-danny-low/en_US-danny-low.onnx',
    tokens: 'vits-piper-en_US-danny-low/tokens.txt',
    dataDir: 'vits-piper-en_US-danny-low/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-en_US-danny-low with Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-en_US-danny-low/en_US-danny-low.onnx",
    lexicon: "",
    tokens: "vits-piper-en_US-danny-low/tokens.txt",
    dataDir: "vits-piper-en_US-danny-low/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-en_US-danny-low with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-en_US-danny-low/en_US-danny-low.onnx";
config.Model.Vits.Tokens = "vits-piper-en_US-danny-low/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-en_US-danny-low/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-en_US-danny-low with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-en_US-danny-low/en_US-danny-low.onnx",
        tokens = "vits-piper-en_US-danny-low/tokens.txt",
        dataDir = "vits-piper-en_US-danny-low/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with vits-piper-en_US-danny-low with the Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-en_US-danny-low/en_US-danny-low.onnx");
    vits.setTokens("vits-piper-en_US-danny-low/tokens.txt");
    vits.setDataDir("vits-piper-en_US-danny-low/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-en_US-danny-low with the Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-en_US-danny-low/en_US-danny-low.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-en_US-danny-low/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-en_US-danny-low/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-en_US-danny-low with the Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-en_US-danny-low/en_US-danny-low.onnx",
				Tokens:  "vits-piper-en_US-danny-low/tokens.txt",
				DataDir: "vits-piper-en_US-danny-low/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.

sample audio for each speaker is listed below:

Speaker 0

vits-piper-en_US-glados-high

Info about this model

This model is converted from https://github.com/rhasspy/piper/issues/187#issuecomment-1805709037

Number of speakers | Sample rate (Hz)
1 | 22050

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_US-glados-high.tar.bz2
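
The archive can also be fetched and unpacked from Python. The following is a minimal sketch that uses only the standard library. The helper names `archive_name`, `extract_archive`, and `download_and_extract` are hypothetical and are not part of sherpa-onnx.

```python
import os
import tarfile
import urllib.request

MODEL_URL = (
    "https://github.com/k2-fsa/sherpa-onnx/releases/download/"
    "tts-models/vits-piper-en_US-glados-high.tar.bz2"
)


def archive_name(url):
    """Return the file name part of a download URL."""
    return url.rsplit("/", 1)[-1]


def extract_archive(archive_path, dest_dir="."):
    """Unpack a .tar.bz2 archive into dest_dir."""
    with tarfile.open(archive_path, "r:bz2") as tar:
        tar.extractall(dest_dir)


def download_and_extract(url, dest_dir="."):
    """Download url into dest_dir, unpack it, and return the archive path."""
    archive_path = os.path.join(dest_dir, archive_name(url))
    urllib.request.urlretrieve(url, archive_path)
    extract_archive(archive_path, dest_dir)
    return archive_path
```

Calling `download_and_extract(MODEL_URL)` from the directory where you run the examples should leave the extracted model directory next to your script, which is where the relative paths in the code samples expect it.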

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI | URL | China mirror
arm64-v8a | Download | Download
armeabi-v7a | Download | Download
x86_64 | Download | Download
x86 | Download | Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-en_US-glados-high with the C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-en_US-glados-high/en_US-glados-high.onnx";
  config.model.vits.tokens = "vits-piper-en_US-glados-high/tokens.txt";
  config.model.vits.data_dir = "vits-piper-en_US-glados-high/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc   -I /tmp/sherpa-onnx/shared/include   -o /tmp/test-piper   /tmp/test-piper.c   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx and soundfile (used below to write the audio file) via

pip install sherpa-onnx soundfile

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_US-glados-high.tar.bz2

You can use the following code to play with vits-piper-en_US-glados-high with the Python API.

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-en_US-glados-high/en_US-glados-high.onnx",
            data_dir="vits-piper-en_US-glados-high/espeak-ng-data",
            tokens="vits-piper-en_US-glados-high/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
                     sid=0,
                     speed=1.0)

sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
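
If soundfile is not available, the samples can be written as a WAV file with the standard library alone. The sketch below assumes that `audio.samples` is a sequence of mono floats in the range [-1, 1] and that `audio.sample_rate` is an integer. The helper `write_wave` is hypothetical and is not part of sherpa-onnx.

```python
import struct
import wave


def write_wave(filename, samples, sample_rate):
    """Write mono float samples in [-1, 1] as a 16-bit PCM WAV file."""
    # Clip each sample first so that out-of-range values cannot overflow int16.
    pcm = [int(max(-1.0, min(1.0, float(s))) * 32767) for s in samples]
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)  # mono
        f.setsampwidth(2)  # 2 bytes per sample = 16-bit
        f.setframerate(sample_rate)
        # WAV stores samples as little-endian signed 16-bit integers.
        f.writeframes(struct.pack("<%dh" % len(pcm), *pcm))
```

With the objects from the example above, this would be called as `write_wave("test.wav", audio.samples, audio.sample_rate)`.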

C++ API

You can use the following code to play with vits-piper-en_US-glados-high with the C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-en_US-glados-high/en_US-glados-high.onnx";
  config.model.vits.tokens = "vits-piper-en_US-glados-high/tokens.txt";
  config.model.vits.data_dir = "vits-piper-en_US-glados-high/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++   -std=c++17   -I /tmp/sherpa-onnx/shared/include   -o /tmp/test-piper   /tmp/test-piper.cc   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-cxx-api   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with vits-piper-en_US-glados-high with the Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-en_US-glados-high/en_US-glados-high.onnx".into()),
                tokens: Some("vits-piper-en_US-glados-high/tokens.txt".into()),
                data_dir: Some("vits-piper-en_US-glados-high/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-en_US-glados-high with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-en_US-glados-high/en_US-glados-high.onnx',
        tokens: 'vits-piper-en_US-glados-high/tokens.txt',
        dataDir: 'vits-piper-en_US-glados-high/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to play with vits-piper-en_US-glados-high with the Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-en_US-glados-high/en_US-glados-high.onnx',
    tokens: 'vits-piper-en_US-glados-high/tokens.txt',
    dataDir: 'vits-piper-en_US-glados-high/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-en_US-glados-high with the Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-en_US-glados-high/en_US-glados-high.onnx",
    lexicon: "",
    tokens: "vits-piper-en_US-glados-high/tokens.txt",
    dataDir: "vits-piper-en_US-glados-high/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-en_US-glados-high with the C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-en_US-glados-high/en_US-glados-high.onnx";
config.Model.Vits.Tokens = "vits-piper-en_US-glados-high/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-en_US-glados-high/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-en_US-glados-high with the Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-en_US-glados-high/en_US-glados-high.onnx",
        tokens = "vits-piper-en_US-glados-high/tokens.txt",
        dataDir = "vits-piper-en_US-glados-high/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with vits-piper-en_US-glados-high with the Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-en_US-glados-high/en_US-glados-high.onnx");
    vits.setTokens("vits-piper-en_US-glados-high/tokens.txt");
    vits.setDataDir("vits-piper-en_US-glados-high/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-en_US-glados-high with the Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-en_US-glados-high/en_US-glados-high.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-en_US-glados-high/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-en_US-glados-high/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-en_US-glados-high with the Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-en_US-glados-high/en_US-glados-high.onnx",
				Tokens:  "vits-piper-en_US-glados-high/tokens.txt",
				DataDir: "vits-piper-en_US-glados-high/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.

sample audio for each speaker is listed below:

Speaker 0

vits-piper-en_US-hfc_female-medium

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/en/en_US/hfc_female/medium

Number of speakers | Sample rate (Hz)
1 | 22050

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_US-hfc_female-medium.tar.bz2
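
Before running any of the examples, it can help to confirm that the extracted directory contains the three paths the code samples on this page reference: the `.onnx` model, `tokens.txt`, and `espeak-ng-data`. The following is a minimal sketch. The helper `missing_model_files` is hypothetical, and it assumes the directory is named `vits-piper-<voice>` and holds `<voice>.onnx`, as in the paths used in the examples.

```python
import os


def missing_model_files(model_dir):
    """Return the expected paths under model_dir that do not exist."""
    # Assumption: a directory named vits-piper-<voice> contains <voice>.onnx.
    name = os.path.basename(os.path.normpath(model_dir))
    prefix = "vits-piper-"
    voice = name[len(prefix):] if name.startswith(prefix) else name
    expected = [
        os.path.join(model_dir, voice + ".onnx"),
        os.path.join(model_dir, "tokens.txt"),
        os.path.join(model_dir, "espeak-ng-data"),
    ]
    return [p for p in expected if not os.path.exists(p)]
```

An empty list from `missing_model_files("vits-piper-en_US-hfc_female-medium")` suggests the archive was extracted into the current directory as the examples expect.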

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI | URL | China mirror
arm64-v8a | Download | Download
armeabi-v7a | Download | Download
x86_64 | Download | Download
x86 | Download | Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-en_US-hfc_female-medium with the C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-en_US-hfc_female-medium/en_US-hfc_female-medium.onnx";
  config.model.vits.tokens = "vits-piper-en_US-hfc_female-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-en_US-hfc_female-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc   -I /tmp/sherpa-onnx/shared/include   -o /tmp/test-piper   /tmp/test-piper.c   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx and soundfile (used below to write the audio file) via

pip install sherpa-onnx soundfile

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_US-hfc_female-medium.tar.bz2

You can use the following code to play with vits-piper-en_US-hfc_female-medium with the Python API.

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-en_US-hfc_female-medium/en_US-hfc_female-medium.onnx",
            data_dir="vits-piper-en_US-hfc_female-medium/espeak-ng-data",
            tokens="vits-piper-en_US-hfc_female-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
                     sid=0,
                     speed=1.0)

sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
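
The Node.js examples on this page report a real-time factor (RTF): the time spent generating divided by the duration of the generated audio. The same measurement could be sketched in Python as follows. The helpers `timed` and `real_time_factor` are hypothetical and are not part of sherpa-onnx.

```python
import time


def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """Return elapsed time divided by audio duration (lower is faster)."""
    duration = num_samples / sample_rate  # audio length in seconds
    return elapsed_seconds / duration


def timed(fn, *args, **kwargs):
    """Call fn and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start
```

With the objects from the example above, one could write `audio, elapsed = timed(tts.generate, text="...", sid=0, speed=1.0)` followed by `real_time_factor(elapsed, len(audio.samples), audio.sample_rate)`. A value below 1 means generation ran faster than playback.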

C++ API

You can use the following code to play with vits-piper-en_US-hfc_female-medium with the C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-en_US-hfc_female-medium/en_US-hfc_female-medium.onnx";
  config.model.vits.tokens = "vits-piper-en_US-hfc_female-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-en_US-hfc_female-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++   -std=c++17   -I /tmp/sherpa-onnx/shared/include   -o /tmp/test-piper   /tmp/test-piper.cc   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-cxx-api   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with vits-piper-en_US-hfc_female-medium with the Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-en_US-hfc_female-medium/en_US-hfc_female-medium.onnx".into()),
                tokens: Some("vits-piper-en_US-hfc_female-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-en_US-hfc_female-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-en_US-hfc_female-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-en_US-hfc_female-medium/en_US-hfc_female-medium.onnx',
        tokens: 'vits-piper-en_US-hfc_female-medium/tokens.txt',
        dataDir: 'vits-piper-en_US-hfc_female-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to play with vits-piper-en_US-hfc_female-medium with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-en_US-hfc_female-medium/en_US-hfc_female-medium.onnx',
    tokens: 'vits-piper-en_US-hfc_female-medium/tokens.txt',
    dataDir: 'vits-piper-en_US-hfc_female-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-en_US-hfc_female-medium with Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-en_US-hfc_female-medium/en_US-hfc_female-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-en_US-hfc_female-medium/tokens.txt",
    dataDir: "vits-piper-en_US-hfc_female-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-en_US-hfc_female-medium with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-en_US-hfc_female-medium/en_US-hfc_female-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-en_US-hfc_female-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-en_US-hfc_female-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-en_US-hfc_female-medium with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-en_US-hfc_female-medium/en_US-hfc_female-medium.onnx",
        tokens = "vits-piper-en_US-hfc_female-medium/tokens.txt",
        dataDir = "vits-piper-en_US-hfc_female-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with vits-piper-en_US-hfc_female-medium with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-en_US-hfc_female-medium/en_US-hfc_female-medium.onnx");
    vits.setTokens("vits-piper-en_US-hfc_female-medium/tokens.txt");
    vits.setDataDir("vits-piper-en_US-hfc_female-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-en_US-hfc_female-medium with Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-en_US-hfc_female-medium/en_US-hfc_female-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-en_US-hfc_female-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-en_US-hfc_female-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-en_US-hfc_female-medium with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-en_US-hfc_female-medium/en_US-hfc_female-medium.onnx",
				Tokens:  "vits-piper-en_US-hfc_female-medium/tokens.txt",
				DataDir: "vits-piper-en_US-hfc_female-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.

sample audio for each speaker is listed below:

Speaker 0

vits-piper-en_US-hfc_male-medium

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/en/en_US/hfc_male/medium

Number of speakers | Sample rate (Hz)
1 | 22050

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_US-hfc_male-medium.tar.bz2
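
If you prefer to fetch the archive from a script, the following is a minimal Python sketch. It assumes that every model archive follows the URL pattern of the address above and that the archive unpacks into a directory named after the model. The helper names archive_url and download_and_extract are hypothetical and are not part of sherpa-onnx.

```python
import tarfile
import urllib.request
from pathlib import Path

# URL prefix copied from the download address above.
BASE_URL = "https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models"


def archive_url(model_name: str) -> str:
    """Return the release URL of the .tar.bz2 archive for a model name."""
    return f"{BASE_URL}/{model_name}.tar.bz2"


def download_and_extract(model_name: str, dest: str = ".") -> Path:
    """Download the archive for model_name and unpack it under dest.

    Hypothetical helper for illustration; it is not part of sherpa-onnx.
    """
    dest_dir = Path(dest)
    dest_dir.mkdir(parents=True, exist_ok=True)
    archive = dest_dir / f"{model_name}.tar.bz2"
    urllib.request.urlretrieve(archive_url(model_name), archive)
    with tarfile.open(archive, "r:bz2") as tar:
        tar.extractall(dest_dir)
    archive.unlink()  # remove the archive once it is unpacked
    # Assumption: the archive contains a top-level directory named model_name.
    return dest_dir / model_name
```

Calling download_and_extract("vits-piper-en_US-hfc_male-medium") should leave the model directory in the current working directory, which is where the examples below expect to find it.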

Android APK

The following table lists the Android TTS Engine APKs that bundle this model, built for sherpa-onnx v1.13.2.

ABI | URL | China mirror
arm64-v8a | Download | Download
armeabi-v7a | Download | Download
x86_64 | Download | Download
x86 | Download | Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-en_US-hfc_male-medium with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-en_US-hfc_male-medium/en_US-hfc_male-medium.onnx";
  config.model.vits.tokens = "vits-piper-en_US-hfc_male-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-en_US-hfc_male-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
  -DSHERPA_ONNX_ENABLE_C_API=ON \
  -DCMAKE_BUILD_TYPE=Release \
  -DBUILD_SHARED_LIBS=ON \
  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
  ..

make
make install

You can find the required header files and libraries inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

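# The source file is listed before the libraries so that linkers which
# resolve symbols in command-line order can find them.
gcc \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-piper \
  /tmp/test-piper.c \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-c-api \
  -lonnxruntime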

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_US-hfc_male-medium.tar.bz2

You can use the following code to play with vits-piper-en_US-hfc_male-medium

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-en_US-hfc_male-medium/en_US-hfc_male-medium.onnx",
            data_dir="vits-piper-en_US-hfc_male-medium/espeak-ng-data",
            tokens="vits-piper-en_US-hfc_male-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
                     sid=0,
                     speed=1.0)

# Write a WAV file, matching the other examples on this page.
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
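
A wrong path is a common reason for a config to fail validation. The following Python sketch checks that the three paths used in the example above exist before you build the config. The helper name missing_model_files is hypothetical, and the file layout (the .onnx model, tokens.txt, and espeak-ng-data inside the model directory) is taken from the example paths.

```python
from pathlib import Path


def missing_model_files(model_dir: str, onnx_name: str) -> list[str]:
    """Return the expected model paths that do not exist under model_dir.

    Hypothetical helper for illustration; it is not part of sherpa-onnx.
    """
    root = Path(model_dir)
    expected = [
        root / onnx_name,         # the acoustic model
        root / "tokens.txt",      # the token table
        root / "espeak-ng-data",  # the espeak-ng data directory
    ]
    return [str(path) for path in expected if not path.exists()]
```

For this model you could call missing_model_files("vits-piper-en_US-hfc_male-medium", "en_US-hfc_male-medium.onnx"). An empty list means all three paths were found.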

C++ API

You can use the following code to play with vits-piper-en_US-hfc_male-medium with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-en_US-hfc_male-medium/en_US-hfc_male-medium.onnx";
  config.model.vits.tokens = "vits-piper-en_US-hfc_male-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-en_US-hfc_male-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
  -DSHERPA_ONNX_ENABLE_C_API=ON \
  -DCMAKE_BUILD_TYPE=Release \
  -DBUILD_SHARED_LIBS=ON \
  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
  ..

make
make install

You can find the required header files and libraries inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

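# The source file is listed before the libraries so that linkers which
# resolve symbols in command-line order can find them.
g++ \
  -std=c++17 \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-piper \
  /tmp/test-piper.cc \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-cxx-api \
  -lsherpa-onnx-c-api \
  -lonnxruntime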

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html
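
The progress callback in the C++ example returns 1 to continue generating and 0 to stop. The following Python sketch simulates that contract without calling sherpa-onnx at all. The function generate_in_chunks is a hypothetical stand-in for the library, and the sketch assumes that the chunk passed to the callback is kept even when the callback then asks to stop.

```python
from typing import Callable, List, Sequence

# Hypothetical callback type: receives a chunk of samples and a progress value
# in [0, 1]; returns 1 to continue or 0 to stop.
ProgressCallback = Callable[[Sequence[float], float], int]


def generate_in_chunks(chunks: Sequence[Sequence[float]],
                       callback: ProgressCallback) -> List[float]:
    """Simulate chunked generation that honors the callback's return value."""
    delivered: List[float] = []
    total = len(chunks)
    for index, chunk in enumerate(chunks, start=1):
        delivered.extend(chunk)
        if callback(chunk, index / total) == 0:
            break  # the callback asked to stop
    return delivered


def stop_at_half(samples: Sequence[float], progress: float) -> int:
    """Example callback: stop once at least half of the chunks are done."""
    print(f"Progress: {progress * 100:.1f}%")
    return 0 if progress >= 0.5 else 1
```

With four one-sample chunks, generate_in_chunks([[0.1], [0.2], [0.3], [0.4]], stop_at_half) stops after the second chunk and returns [0.1, 0.2]. A callback that always returns 1 receives every chunk.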

Rust API

You can use the following code to play with vits-piper-en_US-hfc_male-medium with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-en_US-hfc_male-medium/en_US-hfc_male-medium.onnx".into()),
                tokens: Some("vits-piper-en_US-hfc_male-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-en_US-hfc_male-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-en_US-hfc_male-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-en_US-hfc_male-medium/en_US-hfc_male-medium.onnx',
        tokens: 'vits-piper-en_US-hfc_male-medium/tokens.txt',
        dataDir: 'vits-piper-en_US-hfc_male-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.
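
The Node.js example above reports a real-time factor (RTF): the time spent generating divided by the duration of the generated audio, where the duration is the number of samples divided by the sample rate. The following Python sketch restates that arithmetic. The function name real_time_factor is hypothetical.

```python
def real_time_factor(elapsed_seconds: float, num_samples: int,
                     sample_rate: int) -> float:
    """Return elapsed time divided by the duration of the generated audio."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # audio length in seconds
    return elapsed_seconds / duration
```

For example, 44100 samples at this model's sample rate of 22050 Hz last 2 seconds. If generating them took 0.5 seconds, real_time_factor(0.5, 44100, 22050) returns 0.25. A value below 1 means the audio is generated faster than it plays back.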

Dart API

You can use the following code to play with vits-piper-en_US-hfc_male-medium with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-en_US-hfc_male-medium/en_US-hfc_male-medium.onnx',
    tokens: 'vits-piper-en_US-hfc_male-medium/tokens.txt',
    dataDir: 'vits-piper-en_US-hfc_male-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-en_US-hfc_male-medium with Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-en_US-hfc_male-medium/en_US-hfc_male-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-en_US-hfc_male-medium/tokens.txt",
    dataDir: "vits-piper-en_US-hfc_male-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-en_US-hfc_male-medium with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-en_US-hfc_male-medium/en_US-hfc_male-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-en_US-hfc_male-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-en_US-hfc_male-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-en_US-hfc_male-medium with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-en_US-hfc_male-medium/en_US-hfc_male-medium.onnx",
        tokens = "vits-piper-en_US-hfc_male-medium/tokens.txt",
        dataDir = "vits-piper-en_US-hfc_male-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with vits-piper-en_US-hfc_male-medium with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-en_US-hfc_male-medium/en_US-hfc_male-medium.onnx");
    vits.setTokens("vits-piper-en_US-hfc_male-medium/tokens.txt");
    vits.setDataDir("vits-piper-en_US-hfc_male-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-en_US-hfc_male-medium with Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-en_US-hfc_male-medium/en_US-hfc_male-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-en_US-hfc_male-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-en_US-hfc_male-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-en_US-hfc_male-medium with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-en_US-hfc_male-medium/en_US-hfc_male-medium.onnx",
				Tokens:  "vits-piper-en_US-hfc_male-medium/tokens.txt",
				DataDir: "vits-piper-en_US-hfc_male-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.

sample audio for each speaker is listed below:

Speaker 0

vits-piper-en_US-joe-medium

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/en/en_US/joe/medium

Number of speakers    Sample rate
1                     22050

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_US-joe-medium.tar.bz2
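If you prefer to fetch the archive from a script, the following is a minimal sketch using only the Python standard library. The helper names archive_url and download_and_extract are hypothetical (they are not part of sherpa-onnx), and the sketch assumes the archive unpacks into a directory named after the model, as the paths in the examples below suggest.

```python
import tarfile
import urllib.request
from pathlib import Path

# Release URL prefix, taken from the download address above.
BASE_URL = "https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models"


def archive_url(model_name: str) -> str:
    """Build the release URL of a model archive (hypothetical helper)."""
    return f"{BASE_URL}/{model_name}.tar.bz2"


def download_and_extract(model_name: str, dest: str = ".") -> Path:
    """Download the archive into dest, unpack it, and return the model directory."""
    dest_dir = Path(dest)
    dest_dir.mkdir(parents=True, exist_ok=True)
    archive = dest_dir / f"{model_name}.tar.bz2"
    urllib.request.urlretrieve(archive_url(model_name), str(archive))
    with tarfile.open(archive, "r:bz2") as tar:
        tar.extractall(dest_dir)
    archive.unlink()  # remove the archive once it has been unpacked
    # Assumption: the archive contains a top-level directory named after the model.
    return dest_dir / model_name


# Example (performs a network download, so it is left commented out):
# model_dir = download_and_extract("vits-piper-en_US-joe-medium", "/tmp")
```

Calling download_and_extract("vits-piper-en_US-joe-medium", "/tmp") should leave the model files under /tmp/vits-piper-en_US-joe-medium, which matches the relative paths used by the examples when they are run from /tmp.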

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI            URL        China mirror
arm64-v8a      Download   Download
armeabi-v7a    Download   Download
x86_64         Download   Download
x86            Download   Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-en_US-joe-medium with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-en_US-joe-medium/en_US-joe-medium.onnx";
  config.model.vits.tokens = "vits-piper-en_US-joe-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-en_US-joe-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.c -lsherpa-onnx-c-api -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.
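If you launch the compiled binary from a script instead of an interactive shell, the same library path can be passed through the child process environment. The following is a small sketch; env_with_library_path is a hypothetical helper, not part of sherpa-onnx, and it only distinguishes macOS from other Unix-like systems.

```python
import os
import platform


def env_with_library_path(lib_dir, system=None, base_env=None):
    """Return a copy of the environment with lib_dir prepended to the
    dynamic-library search path (hypothetical helper)."""
    system = system or platform.system()
    env = dict(os.environ if base_env is None else base_env)
    # macOS uses DYLD_LIBRARY_PATH; Linux uses LD_LIBRARY_PATH.
    var = "DYLD_LIBRARY_PATH" if system == "Darwin" else "LD_LIBRARY_PATH"
    existing = env.get(var, "")
    env[var] = f"{lib_dir}:{existing}" if existing else lib_dir
    return env


# Example (requires the compiled binary, so it is left commented out):
# import subprocess
# subprocess.run(
#     ["./test-piper"],
#     cwd="/tmp",
#     env=env_with_library_path("/tmp/sherpa-onnx/shared/lib"),
#     check=True,
# )
```

Unlike export, this leaves the environment of the calling process unchanged and only affects the child process.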

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_US-joe-medium.tar.bz2

You can use the following code to play with vits-piper-en_US-joe-medium with Python API.

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-en_US-joe-medium/en_US-joe-medium.onnx",
            data_dir="vits-piper-en_US-joe-medium/espeak-ng-data",
            tokens="vits-piper-en_US-joe-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
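The Node.js addon example for this model reports a real-time factor (RTF), that is, the elapsed synthesis time divided by the duration of the generated audio. The sketch below shows the same arithmetic in Python. The function real_time_factor is a hypothetical helper, and the commented usage assumes the tts and audio objects from the Python example above.

```python
import time


def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """Elapsed synthesis time divided by audio duration (hypothetical helper).

    A value below 1 means the audio was generated faster than real time.
    """
    duration = num_samples / sample_rate  # audio length in seconds
    return elapsed_seconds / duration


# Example with the objects from the Python example above (left commented out
# because it needs sherpa-onnx and the downloaded model):
# start = time.perf_counter()
# audio = tts.generate(text="Hello", sid=0, speed=1.0)
# elapsed = time.perf_counter() - start
# print(f"RTF = {real_time_factor(elapsed, len(audio.samples), audio.sample_rate):.3f}")
```

For example, 44100 samples at 22050 Hz is 2 seconds of audio, so an elapsed time of 0.5 seconds gives an RTF of 0.25.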

C++ API

You can use the following code to play with vits-piper-en_US-joe-medium with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-en_US-joe-medium/en_US-joe-medium.onnx";
  config.model.vits.tokens = "vits-piper-en_US-joe-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-en_US-joe-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.cc -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with vits-piper-en_US-joe-medium with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-en_US-joe-medium/en_US-joe-medium.onnx".into()),
                tokens: Some("vits-piper-en_US-joe-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-en_US-joe-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-en_US-joe-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-en_US-joe-medium/en_US-joe-medium.onnx',
        tokens: 'vits-piper-en_US-joe-medium/tokens.txt',
        dataDir: 'vits-piper-en_US-joe-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to play with vits-piper-en_US-joe-medium with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-en_US-joe-medium/en_US-joe-medium.onnx',
    tokens: 'vits-piper-en_US-joe-medium/tokens.txt',
    dataDir: 'vits-piper-en_US-joe-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-en_US-joe-medium with Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-en_US-joe-medium/en_US-joe-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-en_US-joe-medium/tokens.txt",
    dataDir: "vits-piper-en_US-joe-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-en_US-joe-medium with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-en_US-joe-medium/en_US-joe-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-en_US-joe-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-en_US-joe-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-en_US-joe-medium with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-en_US-joe-medium/en_US-joe-medium.onnx",
        tokens = "vits-piper-en_US-joe-medium/tokens.txt",
        dataDir = "vits-piper-en_US-joe-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with vits-piper-en_US-joe-medium with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-en_US-joe-medium/en_US-joe-medium.onnx");
    vits.setTokens("vits-piper-en_US-joe-medium/tokens.txt");
    vits.setDataDir("vits-piper-en_US-joe-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-en_US-joe-medium with Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-en_US-joe-medium/en_US-joe-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-en_US-joe-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-en_US-joe-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-en_US-joe-medium with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-en_US-joe-medium/en_US-joe-medium.onnx",
				Tokens:  "vits-piper-en_US-joe-medium/tokens.txt",
				DataDir: "vits-piper-en_US-joe-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.

sample audio for each speaker is listed below:

Speaker 0

vits-piper-en_US-john-medium

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/en/en_US/john/medium

Number of speakers    Sample rate
1                     22050

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_US-john-medium.tar.bz2

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI            URL        China mirror
arm64-v8a      Download   Download
armeabi-v7a    Download   Download
x86_64         Download   Download
x86            Download   Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-en_US-john-medium with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-en_US-john-medium/en_US-john-medium.onnx";
  config.model.vits.tokens = "vits-piper-en_US-john-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-en_US-john-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.c -lsherpa-onnx-c-api -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_US-john-medium.tar.bz2

You can use the following code to play with vits-piper-en_US-john-medium with Python API.

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-en_US-john-medium/en_US-john-medium.onnx",
            data_dir="vits-piper-en_US-john-medium/espeak-ng-data",
            tokens="vits-piper-en_US-john-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
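The Python example above expects three paths inside the extracted model directory: the .onnx model, tokens.txt, and the espeak-ng-data directory. A quick check before building the config can make a wrong working directory easier to spot. The sketch below is only an illustration: missing_model_files is a hypothetical helper, and it assumes the .onnx file is named after the model directory without the vits-piper- prefix, as in the paths used in this section. It needs Python 3.9 or later for str.removeprefix.

```python
from pathlib import Path


def missing_model_files(model_dir):
    """Return the expected model paths that do not exist (hypothetical helper)."""
    root = Path(model_dir)
    # Assumption: "vits-piper-en_US-john-medium" contains "en_US-john-medium.onnx".
    onnx_name = root.name.removeprefix("vits-piper-") + ".onnx"
    expected = [root / onnx_name, root / "tokens.txt", root / "espeak-ng-data"]
    return [str(p) for p in expected if not p.exists()]


# Example: report anything that is missing before creating the TTS object.
# missing = missing_model_files("vits-piper-en_US-john-medium")
# if missing:
#     raise FileNotFoundError(f"Missing model files: {missing}")
```

An empty list means all three paths were found. It does not verify that the files are valid, which config.validate() and model loading still have to do.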

C++ API

You can use the following code to play with vits-piper-en_US-john-medium with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-en_US-john-medium/en_US-john-medium.onnx";
  config.model.vits.tokens = "vits-piper-en_US-john-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-en_US-john-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.cc -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with vits-piper-en_US-john-medium with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-en_US-john-medium/en_US-john-medium.onnx".into()),
                tokens: Some("vits-piper-en_US-john-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-en_US-john-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-en_US-john-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-en_US-john-medium/en_US-john-medium.onnx',
        tokens: 'vits-piper-en_US-john-medium/tokens.txt',
        dataDir: 'vits-piper-en_US-john-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to play with vits-piper-en_US-john-medium with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-en_US-john-medium/en_US-john-medium.onnx',
    tokens: 'vits-piper-en_US-john-medium/tokens.txt',
    dataDir: 'vits-piper-en_US-john-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-en_US-john-medium with Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-en_US-john-medium/en_US-john-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-en_US-john-medium/tokens.txt",
    dataDir: "vits-piper-en_US-john-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-en_US-john-medium with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-en_US-john-medium/en_US-john-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-en_US-john-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-en_US-john-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-en_US-john-medium with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-en_US-john-medium/en_US-john-medium.onnx",
        tokens = "vits-piper-en_US-john-medium/tokens.txt",
        dataDir = "vits-piper-en_US-john-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.
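
In the example above, the callback's return value controls generation: 1 continues and 0 stops. The Python sketch below illustrates that convention with a stand-in loop. `fake_generate` and the chunk data are hypothetical and only mimic the control flow; this is not how sherpa-onnx is implemented.

```python
from typing import Callable, List, Sequence


def fake_generate(chunks: Sequence[List[float]],
                  callback: Callable[[List[float]], int]) -> List[float]:
    """Hypothetical stand-in for an engine that emits audio chunk by chunk."""
    collected: List[float] = []
    for chunk in chunks:
        collected.extend(chunk)
        # 1 means continue, 0 means stop (the convention used above).
        if callback(chunk) == 0:
            break
    return collected


def stop_after_two_chunks() -> Callable[[List[float]], int]:
    """Build a callback that asks the engine to stop once it has seen two chunks."""
    seen = 0

    def callback(samples: List[float]) -> int:
        nonlocal seen
        seen += 1
        return 1 if seen < 2 else 0

    return callback


chunks = [[0.1, 0.2], [0.3], [0.4, 0.5]]
all_audio = fake_generate(chunks, lambda samples: 1)
partial_audio = fake_generate(chunks, stop_after_two_chunks())
print(len(all_audio), len(partial_audio))
```

Returning 0 is a simple way to let a user cancel playback without waiting for the whole text to be synthesized.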

Java API

You can use the following code to play with vits-piper-en_US-john-medium with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-en_US-john-medium/en_US-john-medium.onnx");
    vits.setTokens("vits-piper-en_US-john-medium/tokens.txt");
    vits.setDataDir("vits-piper-en_US-john-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-en_US-john-medium with Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-en_US-john-medium/en_US-john-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-en_US-john-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-en_US-john-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-en_US-john-medium with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-en_US-john-medium/en_US-john-medium.onnx",
				Tokens:  "vits-piper-en_US-john-medium/tokens.txt",
				DataDir: "vits-piper-en_US-john-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.

sample audio clips for different speakers are listed below:

Speaker 0

vits-piper-en_US-kathleen-low

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/en/en_US/kathleen/low

Number of speakers  Sample rate (Hz)
1                   16000

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_US-kathleen-low.tar.bz2
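
If you prefer to script the download, the following Python sketch uses only the standard library. The helper names are hypothetical, and it assumes the archive unpacks into a directory named after the model, which matches the paths used in the examples on this page.

```python
import tarfile
import urllib.request
from pathlib import Path

RELEASE_URL = "https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models"


def model_url(name: str) -> str:
    """Build the archive URL for a model name, following the address above."""
    return f"{RELEASE_URL}/{name}.tar.bz2"


def extract_archive(archive: Path, dest: Path) -> None:
    """Extract a .tar.bz2 archive into dest (only use archives you trust)."""
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive, "r:bz2") as tar:
        tar.extractall(dest)


def download_and_extract(name: str, dest: Path = Path(".")) -> Path:
    """Download the archive for `name`, extract it, and delete the archive."""
    dest.mkdir(parents=True, exist_ok=True)
    archive = dest / f"{name}.tar.bz2"
    urllib.request.urlretrieve(model_url(name), archive)
    extract_archive(archive, dest)
    archive.unlink()
    # Assumption: the archive contains a top-level directory named `name`.
    return dest / name


print(model_url("vits-piper-en_US-kathleen-low"))
```

Calling `download_and_extract("vits-piper-en_US-kathleen-low")` needs network access and leaves the extracted model directory in the current directory.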

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI          URL       China mirror
arm64-v8a    Download  Download
armeabi-v7a  Download  Download
x86_64       Download  Download
x86          Download  Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-en_US-kathleen-low with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-en_US-kathleen-low/en_US-kathleen-low.onnx";
  config.model.vits.tokens = "vits-piper-en_US-kathleen-low/tokens.txt";
  config.model.vits.data_dir = "vits-piper-en_US-kathleen-low/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.
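
The environment variable is needed because the dynamic loader must find the shared libraries at run time. As an illustration, this Python sketch picks the variable name for the platform and builds an environment for launching the binary. The helper name `library_env` is hypothetical; the paths are the ones used above.

```python
import os
import platform
from typing import Dict, Optional


def library_env(lib_dir: str, system: Optional[str] = None,
                base: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Return a copy of the environment with lib_dir prepended to the loader path."""
    system = system or platform.system()
    env = dict(os.environ if base is None else base)
    # macOS uses DYLD_LIBRARY_PATH; Linux uses LD_LIBRARY_PATH.
    var = "DYLD_LIBRARY_PATH" if system == "Darwin" else "LD_LIBRARY_PATH"
    old = env.get(var, "")
    env[var] = f"{lib_dir}:{old}" if old else lib_dir
    return env


env = library_env("/tmp/sherpa-onnx/shared/lib", system="Linux", base={})
print(env["LD_LIBRARY_PATH"])
```

The returned dictionary can be passed to `subprocess.run(["/tmp/test-piper"], env=env, cwd="/tmp")` so that the variable does not have to be exported in the shell.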

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx soundfile

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_US-kathleen-low.tar.bz2

You can use the following code to play with vits-piper-en_US-kathleen-low with Python API.

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-en_US-kathleen-low/en_US-kathleen-low.onnx",
            data_dir="vits-piper-en_US-kathleen-low/espeak-ng-data",
            tokens="vits-piper-en_US-kathleen-low/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
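
If soundfile is not available, the samples can also be saved with Python's standard library. The sketch below assumes that the samples are mono floats in the range [-1, 1], and it uses a short run of silence as a stand-in for `audio.samples`. The helper name `write_wave_pcm16` is hypothetical.

```python
import struct
import wave
from typing import Sequence


def write_wave_pcm16(filename: str, samples: Sequence[float], sample_rate: int) -> None:
    """Write mono float samples in [-1, 1] as a 16-bit PCM WAV file."""
    frames = bytearray()
    for s in samples:
        clipped = max(-1.0, min(1.0, float(s)))  # guard against overshoot
        frames += struct.pack("<h", int(round(clipped * 32767)))
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)   # mono
        f.setsampwidth(2)   # 2 bytes per sample = 16-bit PCM
        f.setframerate(sample_rate)
        f.writeframes(bytes(frames))


# Stand-in for audio.samples: 0.1 s of silence at 16 kHz.
write_wave_pcm16("silence.wav", [0.0] * 1600, 16000)
with wave.open("silence.wav", "rb") as f:
    print(f.getnframes(), f.getframerate())
```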

C++ API

You can use the following code to play with vits-piper-en_US-kathleen-low with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-en_US-kathleen-low/en_US-kathleen-low.onnx";
  config.model.vits.tokens = "vits-piper-en_US-kathleen-low/tokens.txt";
  config.model.vits.data_dir = "vits-piper-en_US-kathleen-low/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with vits-piper-en_US-kathleen-low with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-en_US-kathleen-low/en_US-kathleen-low.onnx".into()),
                tokens: Some("vits-piper-en_US-kathleen-low/tokens.txt".into()),
                data_dir: Some("vits-piper-en_US-kathleen-low/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-en_US-kathleen-low with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-en_US-kathleen-low/en_US-kathleen-low.onnx',
        tokens: 'vits-piper-en_US-kathleen-low/tokens.txt',
        dataDir: 'vits-piper-en_US-kathleen-low/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to play with vits-piper-en_US-kathleen-low with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-en_US-kathleen-low/en_US-kathleen-low.onnx',
    tokens: 'vits-piper-en_US-kathleen-low/tokens.txt',
    dataDir: 'vits-piper-en_US-kathleen-low/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-en_US-kathleen-low with Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-en_US-kathleen-low/en_US-kathleen-low.onnx",
    lexicon: "",
    tokens: "vits-piper-en_US-kathleen-low/tokens.txt",
    dataDir: "vits-piper-en_US-kathleen-low/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-en_US-kathleen-low with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-en_US-kathleen-low/en_US-kathleen-low.onnx";
config.Model.Vits.Tokens = "vits-piper-en_US-kathleen-low/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-en_US-kathleen-low/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-en_US-kathleen-low with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-en_US-kathleen-low/en_US-kathleen-low.onnx",
        tokens = "vits-piper-en_US-kathleen-low/tokens.txt",
        dataDir = "vits-piper-en_US-kathleen-low/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with vits-piper-en_US-kathleen-low with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-en_US-kathleen-low/en_US-kathleen-low.onnx");
    vits.setTokens("vits-piper-en_US-kathleen-low/tokens.txt");
    vits.setDataDir("vits-piper-en_US-kathleen-low/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-en_US-kathleen-low with Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-en_US-kathleen-low/en_US-kathleen-low.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-en_US-kathleen-low/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-en_US-kathleen-low/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-en_US-kathleen-low with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-en_US-kathleen-low/en_US-kathleen-low.onnx",
				Tokens:  "vits-piper-en_US-kathleen-low/tokens.txt",
				DataDir: "vits-piper-en_US-kathleen-low/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.

sample audio clips for different speakers are listed below:

Speaker 0

vits-piper-en_US-kristin-medium

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/en/en_US/kristin/medium

Number of speakers  Sample rate (Hz)
1                   22050

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_US-kristin-medium.tar.bz2

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI          URL       China mirror
arm64-v8a    Download  Download
armeabi-v7a  Download  Download
x86_64       Download  Download
x86          Download  Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-en_US-kristin-medium with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-en_US-kristin-medium/en_US-kristin-medium.onnx";
  config.model.vits.tokens = "vits-piper-en_US-kristin-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-en_US-kristin-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx soundfile

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_US-kristin-medium.tar.bz2

You can use the following code to play with vits-piper-en_US-kristin-medium with Python API.

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-en_US-kristin-medium/en_US-kristin-medium.onnx",
            data_dir="vits-piper-en_US-kristin-medium/espeak-ng-data",
            tokens="vits-piper-en_US-kristin-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)

C++ API

You can use the following code to play with vits-piper-en_US-kristin-medium with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-en_US-kristin-medium/en_US-kristin-medium.onnx";
  config.model.vits.tokens = "vits-piper-en_US-kristin-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-en_US-kristin-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files in /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++   -std=c++17   -I /tmp/sherpa-onnx/shared/include   -o /tmp/test-piper   /tmp/test-piper.cc   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-cxx-api   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with vits-piper-en_US-kristin-medium with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-en_US-kristin-medium/en_US-kristin-medium.onnx".into()),
                tokens: Some("vits-piper-en_US-kristin-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-en_US-kristin-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-en_US-kristin-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-en_US-kristin-medium/en_US-kristin-medium.onnx',
        tokens: 'vits-piper-en_US-kristin-medium/tokens.txt',
        dataDir: 'vits-piper-en_US-kristin-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.
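
The Node.js example above reports the real-time factor (RTF), which is the elapsed generation time divided by the duration of the generated audio. As a small illustration of the same arithmetic, here is a Python sketch; the function name and the numbers are made up for the example.

```python
def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """Return elapsed time divided by the duration of the generated audio."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # audio length in seconds
    return elapsed_seconds / duration


# Made-up numbers: 0.5 s spent generating 44100 samples at 22050 Hz (2 s of audio).
rtf = real_time_factor(0.5, 44100, 22050)
print(f"RTF = {rtf:.3f}")  # prints "RTF = 0.250"
```

An RTF below 1 means the audio is generated faster than it takes to play it back.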

Dart API

You can use the following code to play with vits-piper-en_US-kristin-medium with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-en_US-kristin-medium/en_US-kristin-medium.onnx',
    tokens: 'vits-piper-en_US-kristin-medium/tokens.txt',
    dataDir: 'vits-piper-en_US-kristin-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-en_US-kristin-medium with Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-en_US-kristin-medium/en_US-kristin-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-en_US-kristin-medium/tokens.txt",
    dataDir: "vits-piper-en_US-kristin-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-en_US-kristin-medium with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-en_US-kristin-medium/en_US-kristin-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-en_US-kristin-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-en_US-kristin-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-en_US-kristin-medium with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-en_US-kristin-medium/en_US-kristin-medium.onnx",
        tokens = "vits-piper-en_US-kristin-medium/tokens.txt",
        dataDir = "vits-piper-en_US-kristin-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.
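
The callback in the example above returns 1 to continue generating and 0 to stop. The Python sketch below only simulates that convention with a made-up driver function (generate_in_chunks); it is not how sherpa-onnx is implemented, and whether the chunk delivered with the stop request is kept is an assumption of this sketch.

```python
def generate_in_chunks(chunks, callback):
    """Hypothetical driver: hand each chunk to callback, stop when it returns 0."""
    produced = []
    for chunk in chunks:
        produced.extend(chunk)
        if callback(chunk) == 0:  # 0 means stop, 1 means continue
            break
    return produced


def stop_after_first(samples):
    """Callback that asks the driver to stop after the first chunk."""
    return 0


chunks = [[0.1, 0.2], [0.3], [0.4, 0.5]]
print(generate_in_chunks(chunks, lambda samples: 1))  # all five samples
print(generate_in_chunks(chunks, stop_after_first))   # only the first chunk
```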

Java API

You can use the following code to play with vits-piper-en_US-kristin-medium with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-en_US-kristin-medium/en_US-kristin-medium.onnx");
    vits.setTokens("vits-piper-en_US-kristin-medium/tokens.txt");
    vits.setDataDir("vits-piper-en_US-kristin-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-en_US-kristin-medium with Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-en_US-kristin-medium/en_US-kristin-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-en_US-kristin-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-en_US-kristin-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-en_US-kristin-medium with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-en_US-kristin-medium/en_US-kristin-medium.onnx",
				Tokens:  "vits-piper-en_US-kristin-medium/tokens.txt",
				DataDir: "vits-piper-en_US-kristin-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.

sample audios for different speakers are listed below:

Speaker 0

vits-piper-en_US-kusal-medium

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/en/en_US/kusal/medium

Number of speakers    Sample rate
1                     22050

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_US-kusal-medium.tar.bz2
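
Any download tool can be used to fetch and unpack the archive. As one possibility, the following Python sketch uses only the standard library; the function names download and extract are chosen for illustration. Only extract archives from sources you trust.

```python
import tarfile
import urllib.request

MODEL_URL = (
    "https://github.com/k2-fsa/sherpa-onnx/releases/download/"
    "tts-models/vits-piper-en_US-kusal-medium.tar.bz2"
)


def download(url, archive_path):
    """Download url to archive_path (requires network access)."""
    urllib.request.urlretrieve(url, archive_path)


def extract(archive_path, dest_dir="."):
    """Extract a .tar.bz2 archive into dest_dir and return the member names."""
    with tarfile.open(archive_path, "r:bz2") as tar:
        names = tar.getnames()
        tar.extractall(dest_dir)
    return names


# Usage (not run here because it needs network access):
# download(MODEL_URL, "vits-piper-en_US-kusal-medium.tar.bz2")
# extract("vits-piper-en_US-kusal-medium.tar.bz2")
```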

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI            URL         China mirror
arm64-v8a      Download    Download
armeabi-v7a    Download    Download
x86_64         Download    Download
x86            Download    Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-en_US-kusal-medium with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-en_US-kusal-medium/en_US-kusal-medium.onnx";
  config.model.vits.tokens = "vits-piper-en_US-kusal-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-en_US-kusal-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files in /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc   -I /tmp/sherpa-onnx/shared/include   -o /tmp/test-piper   /tmp/test-piper.c   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.
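
If you launch the binary from a script instead of an interactive shell, the same library path setting can be passed through the child process environment. The Python sketch below shows one way to do that; the helper name library_path_env is made up for illustration, and any platform other than macOS is treated like Linux, which is an assumption of this sketch.

```python
import os
import platform


def library_path_env(lib_dir, base_env=None, system=None):
    """Return a copy of the environment with lib_dir prepended to the loader path."""
    env = dict(os.environ if base_env is None else base_env)
    system = platform.system() if system is None else system
    # macOS uses DYLD_LIBRARY_PATH; everything else falls back to LD_LIBRARY_PATH.
    var = "DYLD_LIBRARY_PATH" if system == "Darwin" else "LD_LIBRARY_PATH"
    old = env.get(var, "")
    env[var] = lib_dir if not old else lib_dir + os.pathsep + old
    return env


# Usage (assumes /tmp/test-piper has been built as described above):
# import subprocess
# subprocess.run(["./test-piper"], cwd="/tmp",
#                env=library_path_env("/tmp/sherpa-onnx/shared/lib"), check=True)
```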

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_US-kusal-medium.tar.bz2

You can use the following code to play with vits-piper-en_US-kusal-medium

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-en_US-kusal-medium/en_US-kusal-medium.onnx",
            data_dir="vits-piper-en_US-kusal-medium/espeak-ng-data",
            tokens="vits-piper-en_US-kusal-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
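
When the config check in the example above fails, a common cause is that the model was extracted to a different directory than the one the script runs in. The following Python sketch checks the three paths used in the config before the TTS object is created; the helper name missing_model_files is made up for illustration and is not part of sherpa-onnx.

```python
from pathlib import Path


def missing_model_files(model_dir, model_name):
    """Return the expected paths under model_dir that do not exist."""
    root = Path(model_dir)
    expected = [
        root / model_name,        # the ONNX model file
        root / "tokens.txt",      # the token table
        root / "espeak-ng-data",  # the directory with espeak-ng data
    ]
    return [str(p) for p in expected if not p.exists()]


missing = missing_model_files(
    "vits-piper-en_US-kusal-medium", "en_US-kusal-medium.onnx"
)
if missing:
    print("Missing:", ", ".join(missing))
```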

C++ API

You can use the following code to play with vits-piper-en_US-kusal-medium with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-en_US-kusal-medium/en_US-kusal-medium.onnx";
  config.model.vits.tokens = "vits-piper-en_US-kusal-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-en_US-kusal-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files in /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++   -std=c++17   -I /tmp/sherpa-onnx/shared/include   -o /tmp/test-piper   /tmp/test-piper.cc   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-cxx-api   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with vits-piper-en_US-kusal-medium with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-en_US-kusal-medium/en_US-kusal-medium.onnx".into()),
                tokens: Some("vits-piper-en_US-kusal-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-en_US-kusal-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-en_US-kusal-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-en_US-kusal-medium/en_US-kusal-medium.onnx',
        tokens: 'vits-piper-en_US-kusal-medium/tokens.txt',
        dataDir: 'vits-piper-en_US-kusal-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to play with vits-piper-en_US-kusal-medium with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-en_US-kusal-medium/en_US-kusal-medium.onnx',
    tokens: 'vits-piper-en_US-kusal-medium/tokens.txt',
    dataDir: 'vits-piper-en_US-kusal-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-en_US-kusal-medium with Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-en_US-kusal-medium/en_US-kusal-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-en_US-kusal-medium/tokens.txt",
    dataDir: "vits-piper-en_US-kusal-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-en_US-kusal-medium with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-en_US-kusal-medium/en_US-kusal-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-en_US-kusal-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-en_US-kusal-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-en_US-kusal-medium with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-en_US-kusal-medium/en_US-kusal-medium.onnx",
        tokens = "vits-piper-en_US-kusal-medium/tokens.txt",
        dataDir = "vits-piper-en_US-kusal-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with vits-piper-en_US-kusal-medium with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-en_US-kusal-medium/en_US-kusal-medium.onnx");
    vits.setTokens("vits-piper-en_US-kusal-medium/tokens.txt");
    vits.setDataDir("vits-piper-en_US-kusal-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-en_US-kusal-medium with the Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-en_US-kusal-medium/en_US-kusal-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-en_US-kusal-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-en_US-kusal-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-en_US-kusal-medium with the Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-en_US-kusal-medium/en_US-kusal-medium.onnx",
				Tokens:  "vits-piper-en_US-kusal-medium/tokens.txt",
				DataDir: "vits-piper-en_US-kusal-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.

sample audio for each speaker is listed below:

Speaker 0

vits-piper-en_US-l2arctic-medium

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/en/en_US/l2arctic/medium

| Number of speakers | Sample rate (Hz) |
|---|---|
| 24 | 22050 |

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_US-l2arctic-medium.tar.bz2
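
The archive can also be fetched and unpacked from a script. The following is a minimal sketch using only the Python standard library. The helper names `model_url` and `download_and_extract` are illustrative and not part of sherpa-onnx, and the sketch assumes that the archive unpacks into a directory named after the model, as the paths in the examples on this page suggest.

```python
import tarfile
import urllib.request
from pathlib import Path

# Release prefix taken from the download address above.
BASE_URL = "https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models"


def model_url(model_name: str) -> str:
    """Build the archive URL for a model (hypothetical helper)."""
    return f"{BASE_URL}/{model_name}.tar.bz2"


def download_and_extract(model_name: str, dest_dir: str = ".") -> Path:
    """Download <model_name>.tar.bz2 and unpack it into dest_dir."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    archive = dest / f"{model_name}.tar.bz2"
    urllib.request.urlretrieve(model_url(model_name), str(archive))
    with tarfile.open(archive, "r:bz2") as tar:
        tar.extractall(dest)
    archive.unlink()  # remove the archive once it is unpacked
    # Assumption: the archive contains a top-level directory named model_name.
    return dest / model_name
```

Calling `download_and_extract("vits-piper-en_US-l2arctic-medium")` would then return the directory that the examples on this page expect to find in the working directory.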

Android APK

The following table lists the Android TTS Engine APKs with this model for sherpa-onnx v1.13.2.

| ABI | URL | China mirror |
|---|---|---|
| arm64-v8a | Download | Download |
| armeabi-v7a | Download | Download |
| x86_64 | Download | Download |
| x86 | Download | Download |

If you don’t know what ABI is, you probably need to select arm64-v8a.
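
If you can read the machine name of the device (for instance the output of `uname -m` in a terminal app on the phone), it can be mapped to one of the ABI names in the table. The following Python sketch shows one such mapping. The table of machine names is an assumption based on common Linux naming, `guess_abi` is a hypothetical helper, and unknown names fall back to arm64-v8a as suggested above.

```python
# Assumed mapping from common machine names to the ABI labels in the table above.
_ABI_BY_MACHINE = {
    "aarch64": "arm64-v8a",
    "arm64": "arm64-v8a",
    "armv7l": "armeabi-v7a",
    "armv8l": "armeabi-v7a",  # 32-bit userspace on a 64-bit CPU
    "x86_64": "x86_64",
    "amd64": "x86_64",
    "i686": "x86",
    "i386": "x86",
}


def guess_abi(machine: str) -> str:
    """Return the likely APK ABI for a machine string, defaulting to arm64-v8a."""
    return _ABI_BY_MACHINE.get(machine.strip().lower(), "arm64-v8a")
```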

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-en_US-l2arctic-medium with the C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-en_US-l2arctic-medium/en_US-l2arctic-medium.onnx";
  config.model.vits.tokens = "vits-piper-en_US-l2arctic-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-en_US-l2arctic-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.c -lsherpa-onnx-c-api -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.
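
If you launch the example from a script, the search path can be set for the child process only instead of being exported in the shell. The following Python sketch shows one way to do that. `library_env` and `run_example` are hypothetical helpers, and the paths are the ones used in this section.

```python
import os
import subprocess
import sys


def library_env(lib_dir, base_env=None, platform_name=sys.platform):
    """Return a copy of the environment with lib_dir prepended to the
    dynamic-library search path (LD_LIBRARY_PATH or DYLD_LIBRARY_PATH)."""
    env = dict(os.environ if base_env is None else base_env)
    var = "DYLD_LIBRARY_PATH" if platform_name == "darwin" else "LD_LIBRARY_PATH"
    old = env.get(var, "")
    env[var] = f"{lib_dir}:{old}" if old else lib_dir
    return env


def run_example(binary="/tmp/test-piper", lib_dir="/tmp/sherpa-onnx/shared/lib"):
    """Run the compiled example from /tmp with the library path set."""
    return subprocess.run([binary], cwd="/tmp", env=library_env(lib_dir), check=True)
```

Because the environment is copied, the settings of the calling process are left unchanged.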

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx soundfile

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_US-l2arctic-medium.tar.bz2

You can use the following code to play with vits-piper-en_US-l2arctic-medium with the Python API:

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-en_US-l2arctic-medium/en_US-l2arctic-medium.onnx",
            data_dir="vits-piper-en_US-l2arctic-medium/espeak-ng-data",
            tokens="vits-piper-en_US-l2arctic-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
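
If you would rather not depend on soundfile, the samples can be written with the standard library. The following is a minimal sketch, assuming the generated samples are mono floats in the range [-1, 1]. `write_wav_int16` is a hypothetical helper and not part of sherpa-onnx.

```python
import struct
import wave


def write_wav_int16(filename, samples, sample_rate):
    """Write mono float samples in [-1, 1] as 16-bit PCM WAV."""
    frames = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))  # clip to avoid integer overflow
        frames += struct.pack("<h", int(round(s * 32767)))
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)  # 2 bytes = 16 bits
        f.setframerate(sample_rate)
        f.writeframes(bytes(frames))
```

With the objects from the example above, the call would be `write_wav_int16("test.wav", audio.samples, audio.sample_rate)`.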

C++ API

You can use the following code to play with vits-piper-en_US-l2arctic-medium with the C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-en_US-l2arctic-medium/en_US-l2arctic-medium.onnx";
  config.model.vits.tokens = "vits-piper-en_US-l2arctic-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-en_US-l2arctic-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.cc -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with vits-piper-en_US-l2arctic-medium with the Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-en_US-l2arctic-medium/en_US-l2arctic-medium.onnx".into()),
                tokens: Some("vits-piper-en_US-l2arctic-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-en_US-l2arctic-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-en_US-l2arctic-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-en_US-l2arctic-medium/en_US-l2arctic-medium.onnx',
        tokens: 'vits-piper-en_US-l2arctic-medium/tokens.txt',
        dataDir: 'vits-piper-en_US-l2arctic-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.
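
The real-time factor (RTF) printed by the example is the processing time divided by the duration of the generated audio. The same arithmetic as a small Python sketch, where `real_time_factor` is a hypothetical helper name:

```python
def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """RTF = processing time / audio duration; below 1 is faster than real time."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # seconds of generated audio
    return elapsed_seconds / duration
```

For instance, generating 44100 samples at 22050 Hz (2 seconds of audio) in 0.5 seconds gives an RTF of 0.25.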

Dart API

You can use the following code to play with vits-piper-en_US-l2arctic-medium with the Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-en_US-l2arctic-medium/en_US-l2arctic-medium.onnx',
    tokens: 'vits-piper-en_US-l2arctic-medium/tokens.txt',
    dataDir: 'vits-piper-en_US-l2arctic-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-en_US-l2arctic-medium with the Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-en_US-l2arctic-medium/en_US-l2arctic-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-en_US-l2arctic-medium/tokens.txt",
    dataDir: "vits-piper-en_US-l2arctic-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-en_US-l2arctic-medium with the C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-en_US-l2arctic-medium/en_US-l2arctic-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-en_US-l2arctic-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-en_US-l2arctic-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-en_US-l2arctic-medium with the Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  val config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-en_US-l2arctic-medium/en_US-l2arctic-medium.onnx",
        tokens = "vits-piper-en_US-l2arctic-medium/tokens.txt",
        dataDir = "vits-piper-en_US-l2arctic-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with vits-piper-en_US-l2arctic-medium with the Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-en_US-l2arctic-medium/en_US-l2arctic-medium.onnx");
    vits.setTokens("vits-piper-en_US-l2arctic-medium/tokens.txt");
    vits.setDataDir("vits-piper-en_US-l2arctic-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-en_US-l2arctic-medium with the Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-en_US-l2arctic-medium/en_US-l2arctic-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-en_US-l2arctic-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-en_US-l2arctic-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-en_US-l2arctic-medium with the Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-en_US-l2arctic-medium/en_US-l2arctic-medium.onnx",
				Tokens:  "vits-piper-en_US-l2arctic-medium/tokens.txt",
				DataDir: "vits-piper-en_US-l2arctic-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.

sample audio for each speaker is listed below:

Speaker 0

Speaker 1

Speaker 2

Speaker 3

Speaker 4

Speaker 5

Speaker 6

Speaker 7

Speaker 8

Speaker 9

Speaker 10

Speaker 11

Speaker 12

Speaker 13

Speaker 14

Speaker 15

Speaker 16

Speaker 17

Speaker 18

Speaker 19

Speaker 20

Speaker 21

Speaker 22

Speaker 23
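
To produce one file per speaker yourself, you can loop over the speaker IDs. The following Python sketch assumes a `tts` object with the `generate(text=..., sid=..., speed=...)` call shown in the Python API example for this model. `synthesize_all_speakers` and the `speaker-<sid>.wav` naming are hypothetical, and the function that writes the file is passed in so the sketch does not depend on a particular audio library.

```python
def synthesize_all_speakers(tts, text, num_speakers, save, speed=1.0):
    """Generate text once per speaker ID and save each result.

    tts  : object with generate(text=..., sid=..., speed=...)
    save : callable(filename, samples, sample_rate), e.g. soundfile.write
    """
    filenames = []
    for sid in range(num_speakers):
        audio = tts.generate(text=text, sid=sid, speed=speed)
        filename = f"speaker-{sid}.wav"  # hypothetical naming scheme
        save(filename, audio.samples, audio.sample_rate)
        filenames.append(filename)
    return filenames
```

For this model, `num_speakers` would be 24, matching the table in the model info.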

vits-piper-en_US-lessac-high

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/en/en_US/lessac/high

| Number of speakers | Sample rate (Hz) |
|---|---|
| 1 | 22050 |

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_US-lessac-high.tar.bz2

Android APK

The following table lists the Android TTS Engine APKs with this model for sherpa-onnx v1.13.2.

| ABI | URL | China mirror |
|---|---|---|
| arm64-v8a | Download | Download |
| armeabi-v7a | Download | Download |
| x86_64 | Download | Download |
| x86 | Download | Download |

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-en_US-lessac-high with the C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-en_US-lessac-high/en_US-lessac-high.onnx";
  config.model.vits.tokens = "vits-piper-en_US-lessac-high/tokens.txt";
  config.model.vits.data_dir = "vits-piper-en_US-lessac-high/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.c -lsherpa-onnx-c-api -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx soundfile

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_US-lessac-high.tar.bz2

You can use the following code to play with vits-piper-en_US-lessac-high with the Python API:

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-en_US-lessac-high/en_US-lessac-high.onnx",
            data_dir="vits-piper-en_US-lessac-high/espeak-ng-data",
            tokens="vits-piper-en_US-lessac-high/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)

C++ API

You can use the following code to play with vits-piper-en_US-lessac-high with the C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-en_US-lessac-high/en_US-lessac-high.onnx";
  config.model.vits.tokens = "vits-piper-en_US-lessac-high/tokens.txt";
  config.model.vits.data_dir = "vits-piper-en_US-lessac-high/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.cc -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with vits-piper-en_US-lessac-high with the Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-en_US-lessac-high/en_US-lessac-high.onnx".into()),
                tokens: Some("vits-piper-en_US-lessac-high/tokens.txt".into()),
                data_dir: Some("vits-piper-en_US-lessac-high/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-en_US-lessac-high with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-en_US-lessac-high/en_US-lessac-high.onnx',
        tokens: 'vits-piper-en_US-lessac-high/tokens.txt',
        dataDir: 'vits-piper-en_US-lessac-high/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.
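
The script above reports a real-time factor (RTF): the time spent generating divided by the duration of the generated audio, so a value below 1 means synthesis runs faster than playback. The following is a minimal Python sketch of the same arithmetic. The helper names real_time_factor and timed are made up for this illustration and are not part of sherpa-onnx, and the numbers in the last line are hypothetical.

```python
import time


def real_time_factor(elapsed_seconds: float, num_samples: int, sample_rate: int) -> float:
    """Return elapsed time divided by the duration of the generated audio."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # seconds of audio
    return elapsed_seconds / duration


def timed(fn, *args, **kwargs):
    """Call fn and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


# Hypothetical numbers: 1.5 s spent generating 132300 samples at 22050 Hz.
print(f"RTF = {real_time_factor(1.5, 132300, 22050):.3f}")
```

In a real program you would wrap the call that generates audio with timed and pass the length of the returned samples and the sample rate to real_time_factor.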

Dart API

You can use the following code to play with vits-piper-en_US-lessac-high with the Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-en_US-lessac-high/en_US-lessac-high.onnx',
    tokens: 'vits-piper-en_US-lessac-high/tokens.txt',
    dataDir: 'vits-piper-en_US-lessac-high/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-en_US-lessac-high with the Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-en_US-lessac-high/en_US-lessac-high.onnx",
    lexicon: "",
    tokens: "vits-piper-en_US-lessac-high/tokens.txt",
    dataDir: "vits-piper-en_US-lessac-high/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-en_US-lessac-high with the C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-en_US-lessac-high/en_US-lessac-high.onnx";
config.Model.Vits.Tokens = "vits-piper-en_US-lessac-high/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-en_US-lessac-high/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-en_US-lessac-high with the Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-en_US-lessac-high/en_US-lessac-high.onnx",
        tokens = "vits-piper-en_US-lessac-high/tokens.txt",
        dataDir = "vits-piper-en_US-lessac-high/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.
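
The callback in the example above follows a simple protocol: it receives a chunk of samples and returns 1 to continue or 0 to stop. The following Python sketch simulates that protocol with plain lists so you can see the control flow. It does not call sherpa-onnx; run_with_callback and stop_after are hypothetical names, and whether the real library keeps the chunk that triggered the stop is an assumption made here for illustration.

```python
from typing import Callable, Iterable, List


def run_with_callback(
    chunks: Iterable[List[float]],
    callback: Callable[[List[float]], int],
) -> List[float]:
    """Pass each chunk to callback; stop as soon as it returns 0."""
    collected: List[float] = []
    for chunk in chunks:
        collected.extend(chunk)
        if callback(chunk) == 0:
            break  # the caller asked to stop
    return collected


def stop_after(limit: int) -> Callable[[List[float]], int]:
    """Build a callback that returns 1 until it has seen `limit` samples."""
    seen = 0

    def callback(samples: List[float]) -> int:
        nonlocal seen
        seen += len(samples)
        return 1 if seen < limit else 0

    return callback


chunks = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
print(len(run_with_callback(chunks, lambda samples: 1)))  # callback never stops
print(len(run_with_callback(chunks, stop_after(3))))      # callback stops early
```

A callback like stop_after is useful when you play audio while it is being generated and want to cancel as soon as the user interrupts.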

Java API

You can use the following code to play with vits-piper-en_US-lessac-high with the Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-en_US-lessac-high/en_US-lessac-high.onnx");
    vits.setTokens("vits-piper-en_US-lessac-high/tokens.txt");
    vits.setDataDir("vits-piper-en_US-lessac-high/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-en_US-lessac-high with the Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-en_US-lessac-high/en_US-lessac-high.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-en_US-lessac-high/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-en_US-lessac-high/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-en_US-lessac-high with the Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-en_US-lessac-high/en_US-lessac-high.onnx",
				Tokens:  "vits-piper-en_US-lessac-high/tokens.txt",
				DataDir: "vits-piper-en_US-lessac-high/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.

sample audios for different speakers are listed below:

Speaker 0

vits-piper-en_US-lessac-low

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/en/en_US/lessac/low

Number of speakers    Sample rate
1                     16000

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_US-lessac-low.tar.bz2
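
If you prefer not to use wget and tar, the following is a minimal Python sketch that downloads the archive and unpacks it with the standard library only. The function names download and extract are made up for this illustration. The usage lines are left as comments because they need network access.

```python
import tarfile
import urllib.request
from pathlib import Path

MODEL_URL = (
    "https://github.com/k2-fsa/sherpa-onnx/releases/download/"
    "tts-models/vits-piper-en_US-lessac-low.tar.bz2"
)


def download(url: str, dest_dir: Path) -> Path:
    """Download url into dest_dir and return the path of the local archive."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    archive = dest_dir / url.rsplit("/", 1)[-1]  # keep the file name from the URL
    urllib.request.urlretrieve(url, archive)
    return archive


def extract(archive: Path, dest_dir: Path) -> list:
    """Unpack a .tar.bz2 archive into dest_dir and return the member names."""
    with tarfile.open(archive, "r:bz2") as tar:
        names = tar.getnames()
        tar.extractall(dest_dir)
    return names


# Usage (requires network access):
#   archive = download(MODEL_URL, Path("."))
#   extract(archive, Path("."))
#   archive.unlink()  # same effect as `rm` after `tar xvf`
```

Only unpack archives from a source you trust, since extracting a tar file writes whatever paths the archive contains.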

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI            URL         China mirror
arm64-v8a      Download    Download
armeabi-v7a    Download    Download
x86_64         Download    Download
x86            Download    Download

If you don’t know what ABI is, you probably need to select arm64-v8a.
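
If you can read the machine name of the device (for example the output of uname -m), the following Python sketch maps it to one of the ABI names in the table. The mapping table and the function name abi_for_machine are assumptions made for this illustration; unknown names fall back to arm64-v8a, following the advice above.

```python
import platform

# Common machine names (as printed by `uname -m`) and the matching Android ABI.
_MACHINE_TO_ABI = {
    "aarch64": "arm64-v8a",
    "arm64": "arm64-v8a",
    "armv7l": "armeabi-v7a",
    "armv8l": "armeabi-v7a",  # 32-bit userspace on a 64-bit ARM CPU
    "x86_64": "x86_64",
    "amd64": "x86_64",
    "i686": "x86",
    "i386": "x86",
}


def abi_for_machine(machine: str, default: str = "arm64-v8a") -> str:
    """Return the Android ABI name for a machine string, or default if unknown."""
    return _MACHINE_TO_ABI.get(machine.strip().lower(), default)


# On the machine running this script:
print(abi_for_machine(platform.machine()))
```

On a connected device, adb shell getprop ro.product.cpu.abi prints the ABI directly, so the mapping is only needed when all you have is the machine name.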

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-en_US-lessac-low with the C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-en_US-lessac-low/en_US-lessac-low.onnx";
  config.model.vits.tokens = "vits-piper-en_US-lessac-low/tokens.txt";
  config.model.vits.data_dir = "vits-piper-en_US-lessac-low/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc   -I /tmp/sherpa-onnx/shared/include   -o /tmp/test-piper   /tmp/test-piper.c   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

If the program cannot find the shared libraries at run time, set the library search path before you run /tmp/test-piper:

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html
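
If you launch /tmp/test-piper from a script instead of a shell, you can set the library path for that one process only. The following Python sketch builds such an environment; it assumes the install prefix /tmp/sherpa-onnx/shared used above, and the helper names library_path_variable and env_with_library_dir are made up for this illustration.

```python
import os
import platform
import subprocess


def library_path_variable(system: str) -> str:
    """Return the dynamic-linker search path variable for a platform name."""
    if system == "Linux":
        return "LD_LIBRARY_PATH"
    if system == "Darwin":  # platform.system() reports macOS as "Darwin"
        return "DYLD_LIBRARY_PATH"
    raise ValueError(f"Unsupported system: {system}")


def env_with_library_dir(lib_dir: str, system: str, base_env=None) -> dict:
    """Return a copy of the environment with lib_dir prepended to the search path."""
    env = dict(os.environ if base_env is None else base_env)
    name = library_path_variable(system)
    current = env.get(name, "")
    env[name] = f"{lib_dir}:{current}" if current else lib_dir
    return env


# Usage (assumes /tmp/test-piper has been built as described above):
#   env = env_with_library_dir("/tmp/sherpa-onnx/shared/lib", platform.system())
#   subprocess.run(["/tmp/test-piper"], cwd="/tmp", env=env, check=True)
```

This leaves the environment of your shell untouched, which helps when several builds of the library are installed side by side.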

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_US-lessac-low.tar.bz2

You can use the following code to play with vits-piper-en_US-lessac-low

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-en_US-lessac-low/en_US-lessac-low.onnx",
            data_dir="vits-piper-en_US-lessac-low/espeak-ng-data",
            tokens="vits-piper-en_US-lessac-low/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
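
A wrong path is a common reason for a config to be rejected. The following is a minimal sketch that checks whether the three paths used in the example above exist before you build the config. The function name missing_model_files is made up for this illustration, and the check only looks at the file system; it does not replace config.validate().

```python
from pathlib import Path
from typing import List


def missing_model_files(model_dir: str, voice: str) -> List[str]:
    """Return the entries under model_dir that the VITS config needs but are absent."""
    root = Path(model_dir)
    missing = []
    if not (root / f"{voice}.onnx").is_file():
        missing.append(f"{voice}.onnx")
    if not (root / "tokens.txt").is_file():
        missing.append("tokens.txt")
    if not (root / "espeak-ng-data").is_dir():
        missing.append("espeak-ng-data")
    return missing


missing = missing_model_files("vits-piper-en_US-lessac-low", "en_US-lessac-low")
if missing:
    print("Missing:", ", ".join(missing))
else:
    print("All model files found")
```

Run it from the directory where you extracted the archive, because the paths in the example are relative.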

C++ API

You can use the following code to play with vits-piper-en_US-lessac-low with the C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-en_US-lessac-low/en_US-lessac-low.onnx";
  config.model.vits.tokens = "vits-piper-en_US-lessac-low/tokens.txt";
  config.model.vits.data_dir = "vits-piper-en_US-lessac-low/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++   -std=c++17   -I /tmp/sherpa-onnx/shared/include   -o /tmp/test-piper   /tmp/test-piper.cc   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-cxx-api   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

If the program cannot find the shared libraries at run time, set the library search path before you run /tmp/test-piper:

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with vits-piper-en_US-lessac-low with the Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-en_US-lessac-low/en_US-lessac-low.onnx".into()),
                tokens: Some("vits-piper-en_US-lessac-low/tokens.txt".into()),
                data_dir: Some("vits-piper-en_US-lessac-low/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-en_US-lessac-low with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-en_US-lessac-low/en_US-lessac-low.onnx',
        tokens: 'vits-piper-en_US-lessac-low/tokens.txt',
        dataDir: 'vits-piper-en_US-lessac-low/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to play with vits-piper-en_US-lessac-low with the Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-en_US-lessac-low/en_US-lessac-low.onnx',
    tokens: 'vits-piper-en_US-lessac-low/tokens.txt',
    dataDir: 'vits-piper-en_US-lessac-low/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-en_US-lessac-low with the Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-en_US-lessac-low/en_US-lessac-low.onnx",
    lexicon: "",
    tokens: "vits-piper-en_US-lessac-low/tokens.txt",
    dataDir: "vits-piper-en_US-lessac-low/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-en_US-lessac-low with the C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-en_US-lessac-low/en_US-lessac-low.onnx";
config.Model.Vits.Tokens = "vits-piper-en_US-lessac-low/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-en_US-lessac-low/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-en_US-lessac-low with the Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-en_US-lessac-low/en_US-lessac-low.onnx",
        tokens = "vits-piper-en_US-lessac-low/tokens.txt",
        dataDir = "vits-piper-en_US-lessac-low/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with vits-piper-en_US-lessac-low with the Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-en_US-lessac-low/en_US-lessac-low.onnx");
    vits.setTokens("vits-piper-en_US-lessac-low/tokens.txt");
    vits.setDataDir("vits-piper-en_US-lessac-low/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-en_US-lessac-low with the Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-en_US-lessac-low/en_US-lessac-low.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-en_US-lessac-low/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-en_US-lessac-low/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-en_US-lessac-low with the Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-en_US-lessac-low/en_US-lessac-low.onnx",
				Tokens:  "vits-piper-en_US-lessac-low/tokens.txt",
				DataDir: "vits-piper-en_US-lessac-low/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.

sample audios for different speakers are listed below:

Speaker 0

vits-piper-en_US-lessac-medium

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/en/en_US/lessac/medium

Number of speakers    Sample rate
1                     22050
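
A wave file generated with this model should report the sample rate listed above. The following is a minimal sketch that reads the header of a generated file with Python's standard wave module. It assumes the file is PCM-encoded, which is what the wave module can read, and the function name wav_info is made up for this illustration.

```python
import wave


def wav_info(path: str) -> dict:
    """Read channel count, sample rate and duration from a PCM wave file."""
    with wave.open(path, "rb") as f:
        num_frames = f.getnframes()
        sample_rate = f.getframerate()
        return {
            "channels": f.getnchannels(),
            "sample_rate": sample_rate,
            "num_frames": num_frames,
            "duration_seconds": num_frames / sample_rate,
        }


# Usage, after one of the examples has written ./test.wav:
#   info = wav_info("./test.wav")
#   print(info["sample_rate"], info["duration_seconds"])
```

If the reported rate differs from the table, check that the paths in your config point to this model and not to another voice.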

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_US-lessac-medium.tar.bz2

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI            URL         China mirror
arm64-v8a      Download    Download
armeabi-v7a    Download    Download
x86_64         Download    Download
x86            Download    Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-en_US-lessac-medium with the C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-en_US-lessac-medium/en_US-lessac-medium.onnx";
  config.model.vits.tokens = "vits-piper-en_US-lessac-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-en_US-lessac-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header and library files in /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.c -lsherpa-onnx-c-api -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_US-lessac-medium.tar.bz2

You can use the following code to play with vits-piper-en_US-lessac-medium

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-en_US-lessac-medium/en_US-lessac-medium.onnx",
            data_dir="vits-piper-en_US-lessac-medium/espeak-ng-data",
            tokens="vits-piper-en_US-lessac-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
                     sid=0,
                     speed=1.0)

sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
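
If soundfile is not installed, the generated samples can also be saved as a WAV file with the standard library. The sketch below is an illustration only: write_wave_16bit is a hypothetical helper, and it assumes the samples are mono floats in the range [-1, 1].

```python
import struct
import wave


def write_wave_16bit(filename, samples, sample_rate):
    """Write mono float samples in [-1, 1] as a 16-bit PCM WAV file."""
    pcm = bytearray()
    for s in samples:
        # Clip to the valid range before scaling to 16-bit integers.
        s = max(-1.0, min(1.0, float(s)))
        pcm += struct.pack("<h", int(round(s * 32767)))
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)   # mono
        f.setsampwidth(2)   # 2 bytes per sample = 16 bit
        f.setframerate(sample_rate)
        f.writeframes(bytes(pcm))
```

With the objects from the example above, the call would be write_wave_16bit("test.wav", audio.samples, audio.sample_rate).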

C++ API

You can use the following code to try vits-piper-en_US-lessac-medium with the C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-en_US-lessac-medium/en_US-lessac-medium.onnx";
  config.model.vits.tokens = "vits-piper-en_US-lessac-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-en_US-lessac-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header and library files in /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.cc -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to try vits-piper-en_US-lessac-medium with the Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-en_US-lessac-medium/en_US-lessac-medium.onnx".into()),
                tokens: Some("vits-piper-en_US-lessac-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-en_US-lessac-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-en_US-lessac-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-en_US-lessac-medium/en_US-lessac-medium.onnx',
        tokens: 'vits-piper-en_US-lessac-medium/tokens.txt',
        dataDir: 'vits-piper-en_US-lessac-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.
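
The real-time factor (RTF) printed by the example above is the processing time divided by the duration of the generated audio. A value below 1 means synthesis runs faster than playback. The arithmetic can be sketched in Python as follows; the function name is chosen here for illustration only.

```python
def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """Return processing time divided by the duration of the audio."""
    duration = num_samples / sample_rate  # audio length in seconds
    return elapsed_seconds / duration


# 44100 samples at 22050 Hz are 2 seconds of audio.
# Generating them in 0.5 seconds gives an RTF of 0.25.
rtf = real_time_factor(0.5, 44100, 22050)
```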

Dart API

You can use the following code to try vits-piper-en_US-lessac-medium with the Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-en_US-lessac-medium/en_US-lessac-medium.onnx',
    tokens: 'vits-piper-en_US-lessac-medium/tokens.txt',
    dataDir: 'vits-piper-en_US-lessac-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to try vits-piper-en_US-lessac-medium with the Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-en_US-lessac-medium/en_US-lessac-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-en_US-lessac-medium/tokens.txt",
    dataDir: "vits-piper-en_US-lessac-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to try vits-piper-en_US-lessac-medium with the C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-en_US-lessac-medium/en_US-lessac-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-en_US-lessac-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-en_US-lessac-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to try vits-piper-en_US-lessac-medium with the Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-en_US-lessac-medium/en_US-lessac-medium.onnx",
        tokens = "vits-piper-en_US-lessac-medium/tokens.txt",
        dataDir = "vits-piper-en_US-lessac-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.
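
The callback's return value decides whether generation goes on: 1 continues, 0 stops. The Python sketch below illustrates that convention with a callback that stops once enough samples have arrived. It does not use sherpa-onnx: fake_generate is a hypothetical stand-in for the library that simply feeds chunks to the callback, and StopAfter is a name chosen for this illustration.

```python
class StopAfter:
    """Callback that asks generation to stop once enough samples arrived."""

    def __init__(self, max_samples):
        self.max_samples = max_samples
        self.received = 0

    def __call__(self, samples):
        self.received += len(samples)
        # 1 means continue, 0 means stop.
        return 1 if self.received < self.max_samples else 0


def fake_generate(chunks, callback):
    """Hypothetical stand-in for the library: pass each chunk to the callback."""
    collected = []
    for chunk in chunks:
        collected.extend(chunk)
        if callback(chunk) == 0:
            break  # the callback asked to stop
    return collected
```

A callback of this shape is one way to cut generation short, for example when the user cancels playback.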

Java API

You can use the following code to try vits-piper-en_US-lessac-medium with the Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-en_US-lessac-medium/en_US-lessac-medium.onnx");
    vits.setTokens("vits-piper-en_US-lessac-medium/tokens.txt");
    vits.setDataDir("vits-piper-en_US-lessac-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to try vits-piper-en_US-lessac-medium with the Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-en_US-lessac-medium/en_US-lessac-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-en_US-lessac-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-en_US-lessac-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to try vits-piper-en_US-lessac-medium with the Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-en_US-lessac-medium/en_US-lessac-medium.onnx",
				Tokens:  "vits-piper-en_US-lessac-medium/tokens.txt",
				DataDir: "vits-piper-en_US-lessac-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.

sample audio for each speaker is listed below:

Speaker 0

vits-piper-en_US-libritts-high

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/en/en_US/libritts/high

Number of speakers    Sample rate
904                   22050
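
Because this model has many speakers, it can help to check a speaker ID before passing it as sid. The sketch below assumes speaker IDs are zero-based, as the sid=0 examples in this document suggest, so valid values would be 0 to 903. Both helper names and the file-naming scheme are hypothetical.

```python
NUM_SPEAKERS = 904  # taken from the table above


def check_speaker_id(sid, num_speakers=NUM_SPEAKERS):
    """Return sid unchanged if it lies in [0, num_speakers), else raise."""
    if not 0 <= sid < num_speakers:
        raise ValueError(
            f"sid must be in [0, {num_speakers - 1}], got {sid}")
    return sid


def output_name(sid):
    """Hypothetical naming scheme: one WAV file per speaker."""
    return f"speaker-{check_speaker_id(sid):03d}.wav"
```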

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_US-libritts-high.tar.bz2

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI            URL         China mirror
arm64-v8a      Download    Download
armeabi-v7a    Download    Download
x86_64         Download    Download
x86            Download    Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to try vits-piper-en_US-libritts-high with the C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-en_US-libritts-high/en_US-libritts-high.onnx";
  config.model.vits.tokens = "vits-piper-en_US-libritts-high/tokens.txt";
  config.model.vits.data_dir = "vits-piper-en_US-libritts-high/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header and library files in /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.c -lsherpa-onnx-c-api -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_US-libritts-high.tar.bz2

You can use the following code to play with vits-piper-en_US-libritts-high

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-en_US-libritts-high/en_US-libritts-high.onnx",
            data_dir="vits-piper-en_US-libritts-high/espeak-ng-data",
            tokens="vits-piper-en_US-libritts-high/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
                     sid=0,
                     speed=1.0)

sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)

C++ API

You can use the following code to try vits-piper-en_US-libritts-high with the C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-en_US-libritts-high/en_US-libritts-high.onnx";
  config.model.vits.tokens = "vits-piper-en_US-libritts-high/tokens.txt";
  config.model.vits.data_dir = "vits-piper-en_US-libritts-high/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header and library files in /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.cc -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to try vits-piper-en_US-libritts-high with the Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-en_US-libritts-high/en_US-libritts-high.onnx".into()),
                tokens: Some("vits-piper-en_US-libritts-high/tokens.txt".into()),
                data_dir: Some("vits-piper-en_US-libritts-high/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-en_US-libritts-high with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-en_US-libritts-high/en_US-libritts-high.onnx',
        tokens: 'vits-piper-en_US-libritts-high/tokens.txt',
        dataDir: 'vits-piper-en_US-libritts-high/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to try vits-piper-en_US-libritts-high with the Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-en_US-libritts-high/en_US-libritts-high.onnx',
    tokens: 'vits-piper-en_US-libritts-high/tokens.txt',
    dataDir: 'vits-piper-en_US-libritts-high/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to try vits-piper-en_US-libritts-high with the Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-en_US-libritts-high/en_US-libritts-high.onnx",
    lexicon: "",
    tokens: "vits-piper-en_US-libritts-high/tokens.txt",
    dataDir: "vits-piper-en_US-libritts-high/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to try vits-piper-en_US-libritts-high with the C# API.

using System;
using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-en_US-libritts-high/en_US-libritts-high.onnx";
config.Model.Vits.Tokens = "vits-piper-en_US-libritts-high/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-en_US-libritts-high/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to try vits-piper-en_US-libritts-high with the Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-en_US-libritts-high/en_US-libritts-high.onnx",
        tokens = "vits-piper-en_US-libritts-high/tokens.txt",
        dataDir = "vits-piper-en_US-libritts-high/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to try vits-piper-en_US-libritts-high with the Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-en_US-libritts-high/en_US-libritts-high.onnx");
    vits.setTokens("vits-piper-en_US-libritts-high/tokens.txt");
    vits.setDataDir("vits-piper-en_US-libritts-high/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to try vits-piper-en_US-libritts-high with the Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-en_US-libritts-high/en_US-libritts-high.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-en_US-libritts-high/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-en_US-libritts-high/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to try vits-piper-en_US-libritts-high with the Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-en_US-libritts-high/en_US-libritts-high.onnx",
				Tokens:  "vits-piper-en_US-libritts-high/tokens.txt",
				DataDir: "vits-piper-en_US-libritts-high/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.

sample audio is available for each of the 904 speakers, Speaker 0 through Speaker 903.
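To produce one sample per speaker yourself, you can loop over speaker IDs. The sketch below is a minimal illustration, not part of sherpa-onnx: `synthesize_speakers` and `write_wave` are hypothetical helpers, and `tts` is assumed to be any object with a `generate(text=..., sid=..., speed=...)` method that returns an object with `samples` and `sample_rate`, as the Python examples in this document do. It writes 16-bit mono WAV files with the standard-library `wave` module.

```python
import struct
import wave
from pathlib import Path


def write_wave(path, samples, sample_rate):
    """Write float samples in [-1, 1] as 16-bit mono PCM (hypothetical helper)."""
    # Clip each sample to [-1, 1] before scaling to the int16 range.
    ints = [int(max(-1.0, min(1.0, float(s))) * 32767) for s in samples]
    frames = struct.pack(f"<{len(ints)}h", *ints)
    with wave.open(str(path), "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)
        f.setframerate(sample_rate)
        f.writeframes(frames)


def synthesize_speakers(tts, text, sids, out_dir, speed=1.0):
    """Generate one wave file per speaker ID and return the paths written."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for sid in sids:
        # Assumption: tts.generate has the same keyword arguments as in
        # the Python API example in this document.
        audio = tts.generate(text=text, sid=sid, speed=speed)
        path = out_dir / f"speaker-{sid}.wav"
        write_wave(path, audio.samples, audio.sample_rate)
        paths.append(path)
    return paths
```

With a real `sherpa_onnx.OfflineTts` instance, `synthesize_speakers(tts, text, range(904), "samples")` would cover every speaker of this model; a smaller range is a quicker way to compare a few voices.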

vits-piper-en_US-libritts_r-medium

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/en/en_US/libritts_r/medium

Number of speakers | Sample rate (Hz)
904 | 22050

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_US-libritts_r-medium.tar.bz2
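
If you prefer to fetch and unpack the archive from a script rather than by hand, the following sketch uses only the Python standard library. The helper names `download` and `extract` are hypothetical and chosen for illustration; the URL is the one given above.

```python
import tarfile
import urllib.request
from pathlib import Path
from typing import List

MODEL_URL = (
    "https://github.com/k2-fsa/sherpa-onnx/releases/download/"
    "tts-models/vits-piper-en_US-libritts_r-medium.tar.bz2"
)


def download(url: str, dest: Path) -> Path:
    """Download url to dest, skipping the download if dest already exists."""
    dest = Path(dest)
    if not dest.exists():
        urllib.request.urlretrieve(url, str(dest))
    return dest


def extract(archive: Path, out_dir: Path) -> List[str]:
    """Extract a .tar.bz2 archive into out_dir and return its member names."""
    with tarfile.open(str(archive), "r:bz2") as tar:
        names = tar.getnames()
        tar.extractall(str(out_dir))
    return names


# Example usage (performs a network download, so it is left commented out):
# archive = download(MODEL_URL, Path("vits-piper-en_US-libritts_r-medium.tar.bz2"))
# print(extract(archive, Path(".")))
```

After extraction, the code examples in this section expect the `vits-piper-en_US-libritts_r-medium` directory to sit in the working directory.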

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI | URL | China mirror
arm64-v8a | Download | Download
armeabi-v7a | Download | Download
x86_64 | Download | Download
x86 | Download | Download

If you don’t know what ABI is, you probably need to select arm64-v8a.
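
If you can read the device's machine architecture string (for example, the output of `uname -m` in a shell on the device), the sketch below maps it to one of the four ABI names in the table. The mapping and the function name `abi_for_machine` are assumptions made for illustration, not part of sherpa-onnx; unknown values fall back to arm64-v8a, following the advice above.

```python
# Assumed mapping from common `uname -m` values to Android ABI names.
_MACHINE_TO_ABI = {
    "aarch64": "arm64-v8a",
    "arm64": "arm64-v8a",
    "armv7l": "armeabi-v7a",
    "armv8l": "armeabi-v7a",  # 32-bit userspace on a 64-bit ARM CPU
    "x86_64": "x86_64",
    "amd64": "x86_64",
    "i686": "x86",
    "i386": "x86",
}


def abi_for_machine(machine: str, default: str = "arm64-v8a") -> str:
    """Return the Android ABI name for a machine string, or default if unknown."""
    return _MACHINE_TO_ABI.get(machine.strip().lower(), default)
```

For instance, `abi_for_machine("aarch64")` selects the arm64-v8a APK.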

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to try vits-piper-en_US-libritts_r-medium with the C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-en_US-libritts_r-medium/en_US-libritts_r-medium.onnx";
  config.model.vits.tokens = "vits-piper-en_US-libritts_r-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-en_US-libritts_r-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc   -I /tmp/sherpa-onnx/shared/include   -o /tmp/test-piper   /tmp/test-piper.c   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx and soundfile via

pip install sherpa-onnx soundfile

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_US-libritts_r-medium.tar.bz2

You can use the following code to try vits-piper-en_US-libritts_r-medium with the Python API.

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-en_US-libritts_r-medium/en_US-libritts_r-medium.onnx",
            data_dir="vits-piper-en_US-libritts_r-medium/espeak-ng-data",
            tokens="vits-piper-en_US-libritts_r-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)

C++ API

You can use the following code to try vits-piper-en_US-libritts_r-medium with the C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-en_US-libritts_r-medium/en_US-libritts_r-medium.onnx";
  config.model.vits.tokens = "vits-piper-en_US-libritts_r-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-en_US-libritts_r-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++   -std=c++17   -I /tmp/sherpa-onnx/shared/include   -o /tmp/test-piper   /tmp/test-piper.cc   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-cxx-api   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to try vits-piper-en_US-libritts_r-medium with the Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-en_US-libritts_r-medium/en_US-libritts_r-medium.onnx".into()),
                tokens: Some("vits-piper-en_US-libritts_r-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-en_US-libritts_r-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-en_US-libritts_r-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-en_US-libritts_r-medium/en_US-libritts_r-medium.onnx',
        tokens: 'vits-piper-en_US-libritts_r-medium/tokens.txt',
        dataDir: 'vits-piper-en_US-libritts_r-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.
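
The Node.js example above reports a real-time factor (RTF): the time spent generating divided by the duration of the generated audio, so values below 1 mean synthesis runs faster than playback. The same calculation can be sketched in Python as follows; `real_time_factor` and `timed` are hypothetical helper names used only for illustration.

```python
import time


def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """Return elapsed time divided by the audio duration in seconds."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate
    return elapsed_seconds / duration


def timed(fn, *args, **kwargs):
    """Call fn and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start
```

With the Python API shown earlier, this could be used as `audio, elapsed = timed(tts.generate, text=text, sid=0, speed=1.0)` followed by `real_time_factor(elapsed, len(audio.samples), audio.sample_rate)`.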

Dart API

You can use the following code to try vits-piper-en_US-libritts_r-medium with the Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-en_US-libritts_r-medium/en_US-libritts_r-medium.onnx',
    tokens: 'vits-piper-en_US-libritts_r-medium/tokens.txt',
    dataDir: 'vits-piper-en_US-libritts_r-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to try vits-piper-en_US-libritts_r-medium with the Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-en_US-libritts_r-medium/en_US-libritts_r-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-en_US-libritts_r-medium/tokens.txt",
    dataDir: "vits-piper-en_US-libritts_r-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to try vits-piper-en_US-libritts_r-medium with the C# API.

using System;
using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-en_US-libritts_r-medium/en_US-libritts_r-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-en_US-libritts_r-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-en_US-libritts_r-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-en_US-libritts_r-medium using the Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-en_US-libritts_r-medium/en_US-libritts_r-medium.onnx",
        tokens = "vits-piper-en_US-libritts_r-medium/tokens.txt",
        dataDir = "vits-piper-en_US-libritts_r-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with vits-piper-en_US-libritts_r-medium using the Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-en_US-libritts_r-medium/en_US-libritts_r-medium.onnx");
    vits.setTokens("vits-piper-en_US-libritts_r-medium/tokens.txt");
    vits.setDataDir("vits-piper-en_US-libritts_r-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-en_US-libritts_r-medium using the Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-en_US-libritts_r-medium/en_US-libritts_r-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-en_US-libritts_r-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-en_US-libritts_r-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-en_US-libritts_r-medium using the Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-en_US-libritts_r-medium/en_US-libritts_r-medium.onnx",
				Tokens:  "vits-piper-en_US-libritts_r-medium/tokens.txt",
				DataDir: "vits-piper-en_US-libritts_r-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.

audio samples for the different speakers are listed below:

Speaker 0 to Speaker 903 (904 speakers, one audio sample per speaker)
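The samples above differ only in the speaker ID passed at generation time. The following is a minimal Python sketch of producing a few of them locally. It reuses the `sherpa_onnx` calls from the Python examples in this document; the helper names `speaker_ids` and `generate_for_speakers`, the output file naming, and the assumption that valid IDs run from 0 to the number of speakers minus 1 are illustrative and not part of sherpa-onnx.

```python
def speaker_ids(num_speakers, first=0, count=3):
    """Return up to `count` valid speaker IDs starting at `first` (hypothetical helper)."""
    if not 0 <= first < num_speakers:
        raise ValueError(f"sid must be in [0, {num_speakers - 1}]")
    return list(range(first, min(first + count, num_speakers)))


def generate_for_speakers(model_dir, text, sids):
    """Generate one WAV file per speaker ID (sketch; needs sherpa-onnx and soundfile)."""
    # Imported lazily so that speaker_ids() can be used without these packages.
    import sherpa_onnx
    import soundfile as sf

    config = sherpa_onnx.OfflineTtsConfig(
        model=sherpa_onnx.OfflineTtsModelConfig(
            vits=sherpa_onnx.OfflineTtsVitsModelConfig(
                model=f"{model_dir}/en_US-libritts_r-medium.onnx",
                data_dir=f"{model_dir}/espeak-ng-data",
                tokens=f"{model_dir}/tokens.txt",
            ),
            num_threads=1,
        ),
    )
    tts = sherpa_onnx.OfflineTts(config)
    for sid in sids:
        audio = tts.generate(text=text, sid=sid, speed=1.0)
        # The file name pattern is an arbitrary choice for this sketch.
        sf.write(f"speaker-{sid}.wav", audio.samples, samplerate=audio.sample_rate)


# Example (requires the extracted model directory):
# generate_for_speakers(
#     "vits-piper-en_US-libritts_r-medium",
#     "Friends fell out often because life was changing so fast.",
#     speaker_ids(904, first=0, count=3),
# )
```

Reusing one `OfflineTts` instance for all speakers avoids reloading the model for every sample.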

vits-piper-en_US-ljspeech-high

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/en/en_US/ljspeech/high

Number of speakers    Sample rate
1                     22050

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_US-ljspeech-high.tar.bz2
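If you prefer to script the download, the following is a minimal Python sketch using only the standard library. It assumes the archive unpacks into a directory with the same name as the model, which matches the paths used in the examples below; the helper names `model_url` and `download_and_extract` are made up for this illustration.

```python
import tarfile
import urllib.request
from pathlib import Path

# URL prefix taken from the download address above.
BASE_URL = "https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models"


def model_url(name: str) -> str:
    """Return the release URL of the archive for the given model name."""
    return f"{BASE_URL}/{name}.tar.bz2"


def download_and_extract(name: str, dest: Path = Path(".")) -> Path:
    """Download <name>.tar.bz2 into dest, unpack it, and return the model directory."""
    archive = dest / f"{name}.tar.bz2"
    urllib.request.urlretrieve(model_url(name), archive)
    with tarfile.open(archive, "r:bz2") as tar:
        tar.extractall(dest)
    archive.unlink()  # the archive is not needed once it has been extracted
    # Assumption: the archive contains a top-level directory named after the model.
    return dest / name


# Example (performs a network download):
# download_and_extract("vits-piper-en_US-ljspeech-high")
```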

Android APK

The following table lists the Android TTS Engine APKs that bundle this model, built for sherpa-onnx v1.13.2.

ABI            URL         China mirror
arm64-v8a      Download    Download
armeabi-v7a    Download    Download
x86_64         Download    Download
x86            Download    Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-en_US-ljspeech-high using the C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-en_US-ljspeech-high/en_US-ljspeech-high.onnx";
  config.model.vits.tokens = "vits-piper-en_US-ljspeech-high/tokens.txt";
  config.model.vits.data_dir = "vits-piper-en_US-ljspeech-high/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leaks in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_US-ljspeech-high.tar.bz2

You can use the following code to play with vits-piper-en_US-ljspeech-high

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-en_US-ljspeech-high/en_US-ljspeech-high.onnx",
            data_dir="vits-piper-en_US-ljspeech-high/espeak-ng-data",
            tokens="vits-piper-en_US-ljspeech-high/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
                     sid=0,
                     speed=1.0)

sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
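Whether soundfile can write MP3 depends on the installed libsndfile build. If you only need a WAV file, the following sketch writes one with the Python standard library instead. It assumes `audio.samples` is a sequence of mono float samples in the range [-1, 1]; the function name `write_wav_pcm16` is hypothetical and not part of sherpa-onnx.

```python
import struct
import wave


def write_wav_pcm16(filename, samples, sample_rate):
    """Write mono float samples in [-1, 1] as 16-bit PCM; return the duration in seconds."""
    frames = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))  # clip so the value fits into int16
        frames += struct.pack("<h", int(s * 32767))
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)  # mono
        f.setsampwidth(2)  # 2 bytes per sample = 16 bits
        f.setframerate(sample_rate)
        f.writeframes(bytes(frames))
    return len(samples) / sample_rate


# Example, with `audio` from the code above:
# write_wav_pcm16("test.wav", audio.samples, audio.sample_rate)
```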

C++ API

You can use the following code to play with vits-piper-en_US-ljspeech-high using the C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-en_US-ljspeech-high/en_US-ljspeech-high.onnx";
  config.model.vits.tokens = "vits-piper-en_US-ljspeech-high/tokens.txt";
  config.model.vits.data_dir = "vits-piper-en_US-ljspeech-high/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html
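The progress callback in the C++ example above returns 1 to continue generating and 0 to stop. The following Python sketch illustrates that contract in isolation, without calling sherpa-onnx at all: `StopAfter` is a hypothetical callback that requests a stop once a given amount of audio has been received, and `fake_generate` is a stand-in for an engine that honours the return value.

```python
class StopAfter:
    """Callback that asks the generator to stop once enough audio has arrived."""

    def __init__(self, max_seconds, sample_rate):
        self.max_samples = int(max_seconds * sample_rate)
        self.received = 0

    def __call__(self, samples, progress):
        self.received += len(samples)
        # 1 -> continue generating, 0 -> stop generating
        return 1 if self.received < self.max_samples else 0


def fake_generate(chunks, callback):
    """Hypothetical stand-in for a TTS engine: delivers chunks until told to stop."""
    delivered = 0
    for i, chunk in enumerate(chunks):
        delivered += 1
        if callback(chunk, (i + 1) / len(chunks)) == 0:
            break
    return delivered


# Five chunks of 4 samples each; stop once 10 samples (1 second at 10 Hz) have arrived.
callback = StopAfter(max_seconds=1.0, sample_rate=10)
num_delivered = fake_generate([[0.0] * 4] * 5, callback)
```

Keeping the counter inside the callback object means the engine only needs to inspect the return value, which is the same division of work as in the C++ example.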

Rust API

You can use the following code to play with vits-piper-en_US-ljspeech-high using the Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-en_US-ljspeech-high/en_US-ljspeech-high.onnx".into()),
                tokens: Some("vits-piper-en_US-ljspeech-high/tokens.txt".into()),
                data_dir: Some("vits-piper-en_US-ljspeech-high/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-en_US-ljspeech-high with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-en_US-ljspeech-high/en_US-ljspeech-high.onnx',
        tokens: 'vits-piper-en_US-ljspeech-high/tokens.txt',
        dataDir: 'vits-piper-en_US-ljspeech-high/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.
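The Node.js example above reports the real-time factor (RTF), which is the elapsed generation time divided by the duration of the generated audio; a value below 1 means generation is faster than playback. A minimal Python sketch of the same arithmetic is shown below. The helper names `real_time_factor` and `timed` are hypothetical.

```python
import time


def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """RTF = elapsed time / audio duration, as computed in the Node.js example."""
    duration = num_samples / sample_rate
    return elapsed_seconds / duration


def timed(fn, *args, **kwargs):
    """Call fn and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


# Example: 44100 samples at 22050 Hz is 2 seconds of audio.
rtf = real_time_factor(elapsed_seconds=0.5, num_samples=44100, sample_rate=22050)
```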

Dart API

You can use the following code to play with vits-piper-en_US-ljspeech-high using the Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-en_US-ljspeech-high/en_US-ljspeech-high.onnx',
    tokens: 'vits-piper-en_US-ljspeech-high/tokens.txt',
    dataDir: 'vits-piper-en_US-ljspeech-high/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-en_US-ljspeech-high using the Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-en_US-ljspeech-high/en_US-ljspeech-high.onnx",
    lexicon: "",
    tokens: "vits-piper-en_US-ljspeech-high/tokens.txt",
    dataDir: "vits-piper-en_US-ljspeech-high/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-en_US-ljspeech-high using the C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-en_US-ljspeech-high/en_US-ljspeech-high.onnx";
config.Model.Vits.Tokens = "vits-piper-en_US-ljspeech-high/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-en_US-ljspeech-high/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to try vits-piper-en_US-ljspeech-high with the Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-en_US-ljspeech-high/en_US-ljspeech-high.onnx",
        tokens = "vits-piper-en_US-ljspeech-high/tokens.txt",
        dataDir = "vits-piper-en_US-ljspeech-high/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to try vits-piper-en_US-ljspeech-high with the Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-en_US-ljspeech-high/en_US-ljspeech-high.onnx");
    vits.setTokens("vits-piper-en_US-ljspeech-high/tokens.txt");
    vits.setDataDir("vits-piper-en_US-ljspeech-high/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to try vits-piper-en_US-ljspeech-high with the Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-en_US-ljspeech-high/en_US-ljspeech-high.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-en_US-ljspeech-high/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-en_US-ljspeech-high/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to try vits-piper-en_US-ljspeech-high with the Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-en_US-ljspeech-high/en_US-ljspeech-high.onnx",
				Tokens:  "vits-piper-en_US-ljspeech-high/tokens.txt",
				DataDir: "vits-piper-en_US-ljspeech-high/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.

sample audios for different speakers are listed below:

Speaker 0

vits-piper-en_US-ljspeech-medium

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/en/en_US/ljspeech/medium

| Number of speakers | Sample rate (Hz) |
|---|---|
| 1 | 22050 |

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_US-ljspeech-medium.tar.bz2
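
If you prefer to script the download instead of using wget and tar, the following is a minimal Python sketch that uses only the standard library. The function names `extract_archive` and `download_and_extract` are made up for this illustration and are not part of sherpa-onnx.

```python
import tarfile
import urllib.request
from pathlib import Path


def extract_archive(archive_path, dest_dir="."):
    """Extract a .tar.bz2 archive into dest_dir and return its top-level names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:bz2") as tar:
        names = sorted({m.name.split("/")[0] for m in tar.getmembers()})
        # Use the safer "data" extraction filter when this Python supports it.
        kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
        tar.extractall(dest, **kwargs)
    return names


def download_and_extract(url, dest_dir="."):
    """Download url into dest_dir, extract it, then delete the archive."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    archive_path = dest / url.rsplit("/", 1)[-1]
    urllib.request.urlretrieve(url, archive_path)
    names = extract_archive(archive_path, dest)
    archive_path.unlink()
    return names
```

Calling `download_and_extract` with the address above should leave the extracted model directory in the current directory, assuming the archive has the same layout as the other model archives on this page.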

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

| ABI | URL | China mirror |
|---|---|---|
| arm64-v8a | Download | Download |
| armeabi-v7a | Download | Download |
| x86_64 | Download | Download |
| x86 | Download | Download |

If you don’t know what ABI is, you probably need to select arm64-v8a.
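
As a rough aid for picking an ABI, the sketch below maps a machine string (for example, the output of `uname -m` on the device) to one of the ABI names in the table. The mapping table and the function name `guess_android_abi` are assumptions made for this illustration; unknown values fall back to arm64-v8a, following the advice above.

```python
def guess_android_abi(machine):
    """Map a machine string such as 'aarch64' to an Android ABI name."""
    table = {
        "aarch64": "arm64-v8a",
        "arm64": "arm64-v8a",
        "armv7l": "armeabi-v7a",
        "armv7": "armeabi-v7a",
        "x86_64": "x86_64",
        "amd64": "x86_64",
        "i386": "x86",
        "i686": "x86",
        "x86": "x86",
    }
    # Fall back to arm64-v8a when the machine string is not recognized.
    return table.get(machine.strip().lower(), "arm64-v8a")
```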

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to try vits-piper-en_US-ljspeech-medium with the C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-en_US-ljspeech-medium/en_US-ljspeech-medium.onnx";
  config.model.vits.tokens = "vits-piper-en_US-ljspeech-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-en_US-ljspeech-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command. The source file is listed before the libraries so that linkers that resolve symbols in command-line order can find them:

gcc \
  -I /tmp/sherpa-onnx/shared/include \
  -L /tmp/sherpa-onnx/shared/lib \
  -o /tmp/test-piper \
  /tmp/test-piper.c \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.
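
If you launch the compiled program from a script, the same library path can be set for the child process only. The following is a minimal sketch; the helper name `with_library_path` is made up for this illustration.

```python
import os
import platform


def with_library_path(lib_dir, system=None, env=None):
    """Return a copy of env with lib_dir prepended to the loader search path."""
    system = system or platform.system()
    env = dict(os.environ if env is None else env)
    # macOS uses DYLD_LIBRARY_PATH; Linux uses LD_LIBRARY_PATH.
    var = "DYLD_LIBRARY_PATH" if system == "Darwin" else "LD_LIBRARY_PATH"
    old = env.get(var, "")
    env[var] = lib_dir + (":" + old if old else "")
    return env
```

For example, `subprocess.run(["/tmp/test-piper"], cwd="/tmp", env=with_library_path("/tmp/sherpa-onnx/shared/lib"))` would run the program without changing the environment of your shell.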

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_US-ljspeech-medium.tar.bz2

You can use the following code to try vits-piper-en_US-ljspeech-medium with the Python API.

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-en_US-ljspeech-medium/en_US-ljspeech-medium.onnx",
            data_dir="vits-piper-en_US-ljspeech-medium/espeak-ng-data",
            tokens="vits-piper-en_US-ljspeech-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
                     sid=0,
                     speed=1.0)

sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
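
If soundfile is not available, the samples can also be saved as a WAV file with the standard library. The following is a minimal sketch, assuming the samples are mono floats in the range [-1, 1]; the function name `write_wav_16bit` is made up for this illustration.

```python
import struct
import wave


def write_wav_16bit(path, samples, sample_rate):
    """Write mono float samples as 16-bit PCM and return the sample count."""
    frames = bytearray()
    for s in samples:
        # Clip to [-1, 1] before scaling to the 16-bit integer range.
        s = max(-1.0, min(1.0, float(s)))
        frames += struct.pack("<h", int(round(s * 32767)))
    with wave.open(str(path), "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)
        f.setframerate(sample_rate)
        f.writeframes(bytes(frames))
    return len(frames) // 2
```

With the objects from the example above, this could be called as `write_wav_16bit("test.wav", audio.samples, audio.sample_rate)`.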

C++ API

You can use the following code to try vits-piper-en_US-ljspeech-medium with the C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-en_US-ljspeech-medium/en_US-ljspeech-medium.onnx";
  config.model.vits.tokens = "vits-piper-en_US-ljspeech-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-en_US-ljspeech-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command. The source file is listed before the libraries so that linkers that resolve symbols in command-line order can find them:

g++ \
  -std=c++17 \
  -I /tmp/sherpa-onnx/shared/include \
  -L /tmp/sherpa-onnx/shared/lib \
  -o /tmp/test-piper \
  /tmp/test-piper.cc \
  -lsherpa-onnx-cxx-api \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to try vits-piper-en_US-ljspeech-medium with the Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-en_US-ljspeech-medium/en_US-ljspeech-medium.onnx".into()),
                tokens: Some("vits-piper-en_US-ljspeech-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-en_US-ljspeech-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to try vits-piper-en_US-ljspeech-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-en_US-ljspeech-medium/en_US-ljspeech-medium.onnx',
        tokens: 'vits-piper-en_US-ljspeech-medium/tokens.txt',
        dataDir: 'vits-piper-en_US-ljspeech-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.
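
The real-time factor (RTF) printed by the example above is the elapsed time divided by the duration of the generated audio. The same arithmetic is sketched below in Python; the function name `real_time_factor` is made up for this illustration.

```python
def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """Return elapsed time divided by the duration of the generated audio."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # audio length in seconds
    return elapsed_seconds / duration
```

A value below 1 means the audio was generated faster than it takes to play it back.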

Dart API

You can use the following code to try vits-piper-en_US-ljspeech-medium with the Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-en_US-ljspeech-medium/en_US-ljspeech-medium.onnx',
    tokens: 'vits-piper-en_US-ljspeech-medium/tokens.txt',
    dataDir: 'vits-piper-en_US-ljspeech-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to try vits-piper-en_US-ljspeech-medium with the Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-en_US-ljspeech-medium/en_US-ljspeech-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-en_US-ljspeech-medium/tokens.txt",
    dataDir: "vits-piper-en_US-ljspeech-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to try vits-piper-en_US-ljspeech-medium with the C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-en_US-ljspeech-medium/en_US-ljspeech-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-en_US-ljspeech-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-en_US-ljspeech-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to try vits-piper-en_US-ljspeech-medium with the Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-en_US-ljspeech-medium/en_US-ljspeech-medium.onnx",
        tokens = "vits-piper-en_US-ljspeech-medium/tokens.txt",
        dataDir = "vits-piper-en_US-ljspeech-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.
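
The callback in the example above returns 1 to continue and 0 to stop. The sketch below illustrates that contract in plain Python. It does not call sherpa-onnx: `generate_in_chunks` is a hypothetical stand-in for an engine that delivers audio chunk by chunk, and whether a real engine keeps the chunk that triggered the stop is an assumption of this sketch.

```python
def generate_in_chunks(chunks, callback=None):
    """Hypothetical stand-in engine: pass each chunk to callback, stop on 0."""
    collected = []
    for chunk in chunks:
        collected.extend(chunk)
        # A return value of 0 asks the engine to stop generating.
        if callback is not None and callback(chunk) == 0:
            break
    return collected


def make_limit_callback(max_samples):
    """Build a callback that stops once max_samples samples have been seen."""
    seen = 0

    def callback(samples):
        nonlocal seen
        seen += len(samples)
        return 1 if seen < max_samples else 0

    return callback
```

A callback like this is one way to cut generation short, for example when the user cancels playback.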

Java API

You can use the following code to try vits-piper-en_US-ljspeech-medium with the Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-en_US-ljspeech-medium/en_US-ljspeech-medium.onnx");
    vits.setTokens("vits-piper-en_US-ljspeech-medium/tokens.txt");
    vits.setDataDir("vits-piper-en_US-ljspeech-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to try vits-piper-en_US-ljspeech-medium with the Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-en_US-ljspeech-medium/en_US-ljspeech-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-en_US-ljspeech-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-en_US-ljspeech-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to try vits-piper-en_US-ljspeech-medium with the Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-en_US-ljspeech-medium/en_US-ljspeech-medium.onnx",
				Tokens:  "vits-piper-en_US-ljspeech-medium/tokens.txt",
				DataDir: "vits-piper-en_US-ljspeech-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.

sample audios for different speakers are listed below:

Speaker 0

vits-piper-en_US-miro-high

Info about this model

This model is converted from https://huggingface.co/OpenVoiceOS/pipertts_en-US_miro

| Number of speakers | Sample rate (Hz) |
|---|---|
| 1 | 22050 |
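
The sample rate listed above can be checked against a generated file. The following is a minimal sketch using the standard library, assuming the file is a PCM WAV file such as the test.wav written by the examples on this page; the function name `wav_info` is made up for this illustration.

```python
import wave


def wav_info(path):
    """Return the sample rate, channel count, and duration of a PCM WAV file."""
    with wave.open(str(path), "rb") as f:
        rate = f.getframerate()
        frames = f.getnframes()
        return {
            "sample_rate": rate,
            "channels": f.getnchannels(),
            "duration": frames / rate,  # seconds
        }
```

For a file generated with this model, `wav_info("test.wav")["sample_rate"]` should match the sample rate in the table.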

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_US-miro-high.tar.bz2

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

| ABI | URL | China mirror |
|---|---|---|
| arm64-v8a | Download | Download |
| armeabi-v7a | Download | Download |
| x86_64 | Download | Download |
| x86 | Download | Download |

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to try vits-piper-en_US-miro-high with the C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-en_US-miro-high/en_US-miro-high.onnx";
  config.model.vits.tokens = "vits-piper-en_US-miro-high/tokens.txt";
  config.model.vits.data_dir = "vits-piper-en_US-miro-high/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command. The source file is listed before the libraries so that linkers that resolve symbols in command-line order can find them:

gcc \
  -I /tmp/sherpa-onnx/shared/include \
  -L /tmp/sherpa-onnx/shared/lib \
  -o /tmp/test-piper \
  /tmp/test-piper.c \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_US-miro-high.tar.bz2

You can use the following code to try vits-piper-en_US-miro-high with the Python API.

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-en_US-miro-high/en_US-miro-high.onnx",
            data_dir="vits-piper-en_US-miro-high/espeak-ng-data",
            tokens="vits-piper-en_US-miro-high/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
                     sid=0,
                     speed=1.0)

sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
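
A wrong path is a common reason for a config to fail validation. The sketch below checks that the three paths used in the example above exist before the TTS object is created. The function name `missing_model_files` is made up for this illustration, and the file layout is taken from the paths in the example.

```python
from pathlib import Path


def missing_model_files(model_dir, model_file):
    """Return the expected model paths that do not exist under model_dir."""
    root = Path(model_dir)
    required = [
        root / model_file,        # the ONNX model
        root / "tokens.txt",      # token table
        root / "espeak-ng-data",  # espeak-ng data directory
    ]
    return [str(p) for p in required if not p.exists()]
```

For this model, `missing_model_files("vits-piper-en_US-miro-high", "en_US-miro-high.onnx")` should return an empty list once the archive has been extracted into the current directory.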

C++ API

You can use the following code to try vits-piper-en_US-miro-high with the C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-en_US-miro-high/en_US-miro-high.onnx";
  config.model.vits.tokens = "vits-piper-en_US-miro-high/tokens.txt";
  config.model.vits.data_dir = "vits-piper-en_US-miro-high/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find required header file and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++   -std=c++17   -I /tmp/sherpa-onnx/shared/include   -L /tmp/sherpa-onnx/shared/lib   -o /tmp/test-piper   /tmp/test-piper.cc   -lsherpa-onnx-cxx-api   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html
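
The callback in the C++ example returns 1 to continue generating and 0 to stop. The Python sketch below illustrates that contract only. `fake_generate` is a hypothetical stand-in for the engine: it is not part of sherpa-onnx, loads no model, and emits silence.

```python
from typing import Callable, List


def fake_generate(total_samples: int, chunk_size: int,
                  callback: Callable[[List[float], float], int]) -> List[float]:
    """Hypothetical generator that emits silent chunks and honours the callback."""
    generated: List[float] = []
    while len(generated) < total_samples:
        n = min(chunk_size, total_samples - len(generated))
        chunk = [0.0] * n
        generated.extend(chunk)
        progress = len(generated) / total_samples
        # A return value of 0 asks the generator to stop early;
        # any other value lets it continue.
        if callback(chunk, progress) == 0:
            break
    return generated


def stop_at_half(samples: List[float], progress: float) -> int:
    print(f"Progress: {progress * 100:.1f}%")
    return 0 if progress >= 0.5 else 1


audio = fake_generate(total_samples=1000, chunk_size=250, callback=stop_at_half)
print(len(audio))  # 500, because the callback stopped generation halfway
```

Returning 0 is useful when the user cancels playback: the samples produced so far are still available.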

Rust API

You can use the following code to try vits-piper-en_US-miro-high with the Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-en_US-miro-high/en_US-miro-high.onnx".into()),
                tokens: Some("vits-piper-en_US-miro-high/tokens.txt".into()),
                data_dir: Some("vits-piper-en_US-miro-high/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-en_US-miro-high with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-en_US-miro-high/en_US-miro-high.onnx',
        tokens: 'vits-piper-en_US-miro-high/tokens.txt',
        dataDir: 'vits-piper-en_US-miro-high/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.
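
The Node.js example reports a real-time factor (RTF), which is the generation time divided by the duration of the generated audio. A minimal Python sketch of the same arithmetic follows. `real_time_factor` is a hypothetical helper name, not a sherpa-onnx function.

```python
def real_time_factor(elapsed_seconds: float, num_samples: int,
                     sample_rate: int) -> float:
    """Return elapsed time divided by audio duration (lower is faster)."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # seconds of audio
    return elapsed_seconds / duration


# 44100 samples at 22050 Hz is 2 s of audio; suppose it took 0.5 s to generate.
rtf = real_time_factor(0.5, 44100, 22050)
print(f"RTF = {rtf:.3f}")  # RTF = 0.250
```

An RTF below 1 means the audio is generated faster than it plays back.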

Dart API

You can use the following code to try vits-piper-en_US-miro-high with the Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-en_US-miro-high/en_US-miro-high.onnx',
    tokens: 'vits-piper-en_US-miro-high/tokens.txt',
    dataDir: 'vits-piper-en_US-miro-high/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to try vits-piper-en_US-miro-high with the Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-en_US-miro-high/en_US-miro-high.onnx",
    lexicon: "",
    tokens: "vits-piper-en_US-miro-high/tokens.txt",
    dataDir: "vits-piper-en_US-miro-high/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to try vits-piper-en_US-miro-high with the C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-en_US-miro-high/en_US-miro-high.onnx";
config.Model.Vits.Tokens = "vits-piper-en_US-miro-high/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-en_US-miro-high/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to try vits-piper-en_US-miro-high with the Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-en_US-miro-high/en_US-miro-high.onnx",
        tokens = "vits-piper-en_US-miro-high/tokens.txt",
        dataDir = "vits-piper-en_US-miro-high/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to try vits-piper-en_US-miro-high with the Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-en_US-miro-high/en_US-miro-high.onnx");
    vits.setTokens("vits-piper-en_US-miro-high/tokens.txt");
    vits.setDataDir("vits-piper-en_US-miro-high/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to try vits-piper-en_US-miro-high with the Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-en_US-miro-high/en_US-miro-high.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-en_US-miro-high/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-en_US-miro-high/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to try vits-piper-en_US-miro-high with the Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-en_US-miro-high/en_US-miro-high.onnx",
				Tokens:  "vits-piper-en_US-miro-high/tokens.txt",
				DataDir: "vits-piper-en_US-miro-high/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.

audio samples for each speaker are listed below:

Speaker 0

vits-piper-en_US-norman-medium

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/en/en_US/norman/medium

Number of speakers | Sample rate (Hz)
1 | 22050

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_US-norman-medium.tar.bz2
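
As a rough alternative to wget and tar, the archive can be fetched and unpacked with the Python standard library. The helper names `download` and `extract` are assumptions made for this sketch; only the URL comes from this page.

```python
import tarfile
import urllib.request
from pathlib import Path

MODEL_URL = (
    "https://github.com/k2-fsa/sherpa-onnx/releases/download/"
    "tts-models/vits-piper-en_US-norman-medium.tar.bz2"
)


def download(url: str, archive: Path) -> Path:
    """Download url to archive and return the archive path."""
    urllib.request.urlretrieve(url, str(archive))
    return archive


def extract(archive: Path, dest: Path) -> list:
    """Unpack a .tar.bz2 archive into dest and return the member names."""
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(str(archive), "r:bz2") as tar:
        names = tar.getnames()
        # Use the safer "data" filter when this Python version provides it.
        if hasattr(tarfile, "data_filter"):
            tar.extractall(str(dest), filter="data")
        else:
            tar.extractall(str(dest))
    return names


def main() -> None:
    archive = download(MODEL_URL, Path("vits-piper-en_US-norman-medium.tar.bz2"))
    for name in extract(archive, Path(".")):
        print(name)
```

Call `main()` to download the archive into the current directory and extract it there. Nothing is downloaded until you do.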

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI | URL | China mirror
arm64-v8a | Download | Download
armeabi-v7a | Download | Download
x86_64 | Download | Download
x86 | Download | Download

If you don’t know what ABI is, you probably need to select arm64-v8a.
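
If a Python interpreter is available on the device itself (for example in a terminal app), a sketch like the following can suggest which ABI to pick. The mapping from `platform.machine()` values to ABI names is an assumption that covers common values only, and `guess_android_abi` is a hypothetical helper. Unknown values fall back to arm64-v8a, matching the advice above.

```python
import platform

# Assumed mapping from common machine strings to Android ABI names.
_MACHINE_TO_ABI = {
    "aarch64": "arm64-v8a",
    "arm64": "arm64-v8a",
    "armv7l": "armeabi-v7a",
    "x86_64": "x86_64",
    "amd64": "x86_64",
    "i686": "x86",
    "i386": "x86",
}


def guess_android_abi(machine: str) -> str:
    """Return the ABI name for a machine string, defaulting to arm64-v8a."""
    return _MACHINE_TO_ABI.get(machine.lower(), "arm64-v8a")


print(guess_android_abi(platform.machine()))
```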

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to try vits-piper-en_US-norman-medium with the C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-en_US-norman-medium/en_US-norman-medium.onnx";
  config.model.vits.tokens = "vits-piper-en_US-norman-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-en_US-norman-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc   -I /tmp/sherpa-onnx/shared/include   -L /tmp/sherpa-onnx/shared/lib   -o /tmp/test-piper   /tmp/test-piper.c   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_US-norman-medium.tar.bz2

You can use the following code to play with vits-piper-en_US-norman-medium

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-en_US-norman-medium/en_US-norman-medium.onnx",
            data_dir="vits-piper-en_US-norman-medium/espeak-ng-data",
            tokens="vits-piper-en_US-norman-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
                     sid=0,
                     speed=1.0)

sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
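
If soundfile is not available, the generated samples can also be saved with the standard-library wave module. The sketch below assumes the samples are mono floats in the range [-1, 1]. `write_wave_16bit` is a hypothetical helper, not part of sherpa-onnx, and the demo writes a short test tone instead of real TTS output.

```python
import math
import os
import struct
import tempfile
import wave


def write_wave_16bit(filename: str, samples, sample_rate: int) -> None:
    """Write mono float samples in [-1, 1] as a 16-bit PCM WAV file."""
    ints = []
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))  # clip out-of-range values
        ints.append(int(s * 32767))
    pcm = struct.pack(f"<{len(ints)}h", *ints)
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)      # mono
        f.setsampwidth(2)      # 2 bytes per sample = 16 bit
        f.setframerate(sample_rate)
        f.writeframes(pcm)


# Demo: a 0.1 s, 440 Hz tone at 22050 Hz, written to the temp directory.
sample_rate = 22050
tone = [0.5 * math.sin(2 * math.pi * 440 * i / sample_rate)
        for i in range(sample_rate // 10)]
out_path = os.path.join(tempfile.gettempdir(), "sherpa_onnx_sketch_tone.wav")
write_wave_16bit(out_path, tone, sample_rate)
print("Saved to", out_path)
```

With the objects from the example above, the call would be `write_wave_16bit("test.wav", audio.samples, audio.sample_rate)`.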

C++ API

You can use the following code to try vits-piper-en_US-norman-medium with the C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-en_US-norman-medium/en_US-norman-medium.onnx";
  config.model.vits.tokens = "vits-piper-en_US-norman-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-en_US-norman-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++   -std=c++17   -I /tmp/sherpa-onnx/shared/include   -L /tmp/sherpa-onnx/shared/lib   -o /tmp/test-piper   /tmp/test-piper.cc   -lsherpa-onnx-cxx-api   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to try vits-piper-en_US-norman-medium with the Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-en_US-norman-medium/en_US-norman-medium.onnx".into()),
                tokens: Some("vits-piper-en_US-norman-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-en_US-norman-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-en_US-norman-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-en_US-norman-medium/en_US-norman-medium.onnx',
        tokens: 'vits-piper-en_US-norman-medium/tokens.txt',
        dataDir: 'vits-piper-en_US-norman-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to try vits-piper-en_US-norman-medium with the Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-en_US-norman-medium/en_US-norman-medium.onnx',
    tokens: 'vits-piper-en_US-norman-medium/tokens.txt',
    dataDir: 'vits-piper-en_US-norman-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to try vits-piper-en_US-norman-medium with the Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-en_US-norman-medium/en_US-norman-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-en_US-norman-medium/tokens.txt",
    dataDir: "vits-piper-en_US-norman-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to try vits-piper-en_US-norman-medium with the C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-en_US-norman-medium/en_US-norman-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-en_US-norman-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-en_US-norman-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to try vits-piper-en_US-norman-medium with the Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-en_US-norman-medium/en_US-norman-medium.onnx",
        tokens = "vits-piper-en_US-norman-medium/tokens.txt",
        dataDir = "vits-piper-en_US-norman-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to try vits-piper-en_US-norman-medium with the Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-en_US-norman-medium/en_US-norman-medium.onnx");
    vits.setTokens("vits-piper-en_US-norman-medium/tokens.txt");
    vits.setDataDir("vits-piper-en_US-norman-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to try vits-piper-en_US-norman-medium with the Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-en_US-norman-medium/en_US-norman-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-en_US-norman-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-en_US-norman-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to try vits-piper-en_US-norman-medium with the Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-en_US-norman-medium/en_US-norman-medium.onnx",
				Tokens:  "vits-piper-en_US-norman-medium/tokens.txt",
				DataDir: "vits-piper-en_US-norman-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.

sample audio clips for different speakers are listed below:

Speaker 0

vits-piper-en_US-reza_ibrahim-medium

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/en/en_US/reza_ibrahim/medium

Number of speakers | Sample rate
1 | 22050

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_US-reza_ibrahim-medium.tar.bz2
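
The archive can also be fetched and unpacked from Python. The following is a minimal sketch that uses only the standard library. The helper names `archive_url` and `download_and_extract` are hypothetical (they are not part of sherpa-onnx), and the sketch assumes the archive unpacks into a directory named after the model, as the paths used in the examples below suggest.

```python
import tarfile
import urllib.request
from pathlib import Path

# Release URL prefix copied from the download address above.
BASE_URL = "https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models"


def archive_url(model_name: str) -> str:
    """Return the release URL of the .tar.bz2 archive for a model."""
    return f"{BASE_URL}/{model_name}.tar.bz2"


def download_and_extract(model_name: str, dest: str = ".") -> Path:
    """Download the archive, unpack it into dest, and delete the archive."""
    dest_dir = Path(dest)
    dest_dir.mkdir(parents=True, exist_ok=True)
    archive = dest_dir / f"{model_name}.tar.bz2"
    urllib.request.urlretrieve(archive_url(model_name), str(archive))
    with tarfile.open(archive, "r:bz2") as tar:
        tar.extractall(dest_dir)
    archive.unlink()
    # Assumption: the archive contains one top-level directory named after the model.
    return dest_dir / model_name


# Usage (needs network access, so it is not executed here):
#   model_dir = download_and_extract("vits-piper-en_US-reza_ibrahim-medium")
```

Keeping the URL construction in its own function makes it possible to check the address without touching the network.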

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI | URL | China mirror
arm64-v8a | Download | Download
armeabi-v7a | Download | Download
x86_64 | Download | Download
x86 | Download | Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to try vits-piper-en_US-reza_ibrahim-medium with the C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-en_US-reza_ibrahim-medium/en_US-reza_ibrahim-medium.onnx";
  config.model.vits.tokens = "vits-piper-en_US-reza_ibrahim-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-en_US-reza_ibrahim-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
  -DSHERPA_ONNX_ENABLE_C_API=ON \
  -DCMAKE_BUILD_TYPE=Release \
  -DBUILD_SHARED_LIBS=ON \
  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

# Libraries are listed after the source file so the linker can resolve its symbols
gcc \
  -I /tmp/sherpa-onnx/shared/include \
  -L /tmp/sherpa-onnx/shared/lib \
  -o /tmp/test-piper \
  /tmp/test-piper.c \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx and soundfile (used below to write the audio file) via

pip install sherpa-onnx soundfile

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_US-reza_ibrahim-medium.tar.bz2

You can use the following code to play with vits-piper-en_US-reza_ibrahim-medium

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-en_US-reza_ibrahim-medium/en_US-reza_ibrahim-medium.onnx",
            data_dir="vits-piper-en_US-reza_ibrahim-medium/espeak-ng-data",
            tokens="vits-piper-en_US-reza_ibrahim-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
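
The Node.js example on this page reports a real-time factor (RTF), computed as elapsed time divided by the duration of the generated audio. The same measurement can be sketched in Python as follows. The helper names `real_time_factor` and `timed` are hypothetical and not part of sherpa-onnx. An RTF below 1 means synthesis ran faster than playback.

```python
import time


def real_time_factor(elapsed_seconds: float, num_samples: int, sample_rate: int) -> float:
    """Return elapsed time divided by the duration of the generated audio."""
    duration = num_samples / sample_rate  # audio length in seconds
    return elapsed_seconds / duration


def timed(fn, *args, **kwargs):
    """Call fn and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


# Usage with the objects from the example above (not executed here):
#   audio, elapsed = timed(tts.generate, text="...", sid=0, speed=1.0)
#   rtf = real_time_factor(elapsed, len(audio.samples), audio.sample_rate)
```

For instance, 44100 samples at 22050 Hz are 2 seconds of audio, so generating them in 1 second gives an RTF of 0.5.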

C++ API

You can use the following code to try vits-piper-en_US-reza_ibrahim-medium with the C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-en_US-reza_ibrahim-medium/en_US-reza_ibrahim-medium.onnx";
  config.model.vits.tokens = "vits-piper-en_US-reza_ibrahim-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-en_US-reza_ibrahim-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
  -DSHERPA_ONNX_ENABLE_C_API=ON \
  -DCMAKE_BUILD_TYPE=Release \
  -DBUILD_SHARED_LIBS=ON \
  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

# Libraries are listed after the source file so the linker can resolve its symbols
g++ \
  -std=c++17 \
  -I /tmp/sherpa-onnx/shared/include \
  -L /tmp/sherpa-onnx/shared/lib \
  -o /tmp/test-piper \
  /tmp/test-piper.cc \
  -lsherpa-onnx-cxx-api \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to try vits-piper-en_US-reza_ibrahim-medium with the Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-en_US-reza_ibrahim-medium/en_US-reza_ibrahim-medium.onnx".into()),
                tokens: Some("vits-piper-en_US-reza_ibrahim-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-en_US-reza_ibrahim-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-en_US-reza_ibrahim-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-en_US-reza_ibrahim-medium/en_US-reza_ibrahim-medium.onnx',
        tokens: 'vits-piper-en_US-reza_ibrahim-medium/tokens.txt',
        dataDir: 'vits-piper-en_US-reza_ibrahim-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to try vits-piper-en_US-reza_ibrahim-medium with the Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-en_US-reza_ibrahim-medium/en_US-reza_ibrahim-medium.onnx',
    tokens: 'vits-piper-en_US-reza_ibrahim-medium/tokens.txt',
    dataDir: 'vits-piper-en_US-reza_ibrahim-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to try vits-piper-en_US-reza_ibrahim-medium with the Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-en_US-reza_ibrahim-medium/en_US-reza_ibrahim-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-en_US-reza_ibrahim-medium/tokens.txt",
    dataDir: "vits-piper-en_US-reza_ibrahim-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to try vits-piper-en_US-reza_ibrahim-medium with the C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-en_US-reza_ibrahim-medium/en_US-reza_ibrahim-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-en_US-reza_ibrahim-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-en_US-reza_ibrahim-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to try vits-piper-en_US-reza_ibrahim-medium with the Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-en_US-reza_ibrahim-medium/en_US-reza_ibrahim-medium.onnx",
        tokens = "vits-piper-en_US-reza_ibrahim-medium/tokens.txt",
        dataDir = "vits-piper-en_US-reza_ibrahim-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to try vits-piper-en_US-reza_ibrahim-medium with the Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-en_US-reza_ibrahim-medium/en_US-reza_ibrahim-medium.onnx");
    vits.setTokens("vits-piper-en_US-reza_ibrahim-medium/tokens.txt");
    vits.setDataDir("vits-piper-en_US-reza_ibrahim-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to try vits-piper-en_US-reza_ibrahim-medium with the Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-en_US-reza_ibrahim-medium/en_US-reza_ibrahim-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-en_US-reza_ibrahim-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-en_US-reza_ibrahim-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to try vits-piper-en_US-reza_ibrahim-medium with the Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-en_US-reza_ibrahim-medium/en_US-reza_ibrahim-medium.onnx",
				Tokens:  "vits-piper-en_US-reza_ibrahim-medium/tokens.txt",
				DataDir: "vits-piper-en_US-reza_ibrahim-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.

sample audio clips for different speakers are listed below:

Speaker 0

vits-piper-en_US-ryan-high

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/en/en_US/ryan/high

Number of speakers | Sample rate
1 | 22050

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_US-ryan-high.tar.bz2

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI | URL | China mirror
arm64-v8a | Download | Download
armeabi-v7a | Download | Download
x86_64 | Download | Download
x86 | Download | Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to try vits-piper-en_US-ryan-high with the C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-en_US-ryan-high/en_US-ryan-high.onnx";
  config.model.vits.tokens = "vits-piper-en_US-ryan-high/tokens.txt";
  config.model.vits.data_dir = "vits-piper-en_US-ryan-high/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
  -DSHERPA_ONNX_ENABLE_C_API=ON \
  -DCMAKE_BUILD_TYPE=Release \
  -DBUILD_SHARED_LIBS=ON \
  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

# Libraries are listed after the source file so the linker can resolve its symbols
gcc \
  -I /tmp/sherpa-onnx/shared/include \
  -L /tmp/sherpa-onnx/shared/lib \
  -o /tmp/test-piper \
  /tmp/test-piper.c \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx and soundfile (used below to write the audio file) via

pip install sherpa-onnx soundfile

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_US-ryan-high.tar.bz2

You can use the following code to play with vits-piper-en_US-ryan-high

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-en_US-ryan-high/en_US-ryan-high.onnx",
            data_dir="vits-piper-en_US-ryan-high/espeak-ng-data",
            tokens="vits-piper-en_US-ryan-high/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
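
When `config.validate()` fails, the usual cause is a wrong or missing file path. The sketch below shows one way to build the three paths from the extracted directory name and check them before creating the config. The helper names `vits_piper_paths` and `missing_files` are hypothetical and not part of sherpa-onnx. The naming rule (the `.onnx` file is named after the directory without its `vits-piper-` prefix) is an assumption inferred from the paths used in the examples on this page.

```python
from pathlib import Path

PREFIX = "vits-piper-"


def vits_piper_paths(model_dir: str) -> dict:
    """Build the model, tokens and data_dir paths for an extracted archive."""
    root = Path(model_dir)
    name = root.name
    # Assumption: the .onnx file is named after the directory minus the prefix.
    voice = name[len(PREFIX):] if name.startswith(PREFIX) else name
    return {
        "model": (root / f"{voice}.onnx").as_posix(),
        "tokens": (root / "tokens.txt").as_posix(),
        "data_dir": (root / "espeak-ng-data").as_posix(),
    }


def missing_files(paths: dict) -> list:
    """Return the configured paths that do not exist on disk."""
    return [p for p in paths.values() if not Path(p).exists()]


# Usage with the example above (not executed here):
#   paths = vits_piper_paths("vits-piper-en_US-ryan-high")
#   if missing_files(paths):
#       raise FileNotFoundError(missing_files(paths))
#   vits = sherpa_onnx.OfflineTtsVitsModelConfig(**paths)
```

The dictionary keys match the keyword arguments used in the example above, so the result can be unpacked directly into `OfflineTtsVitsModelConfig`.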

C++ API

You can use the following code to try vits-piper-en_US-ryan-high with the C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-en_US-ryan-high/en_US-ryan-high.onnx";
  config.model.vits.tokens = "vits-piper-en_US-ryan-high/tokens.txt";
  config.model.vits.data_dir = "vits-piper-en_US-ryan-high/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
  -DSHERPA_ONNX_ENABLE_C_API=ON \
  -DCMAKE_BUILD_TYPE=Release \
  -DBUILD_SHARED_LIBS=ON \
  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

# Libraries are listed after the source file so the linker can resolve its symbols
g++ \
  -std=c++17 \
  -I /tmp/sherpa-onnx/shared/include \
  -L /tmp/sherpa-onnx/shared/lib \
  -o /tmp/test-piper \
  /tmp/test-piper.cc \
  -lsherpa-onnx-cxx-api \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to try vits-piper-en_US-ryan-high with the Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-en_US-ryan-high/en_US-ryan-high.onnx".into()),
                tokens: Some("vits-piper-en_US-ryan-high/tokens.txt".into()),
                data_dir: Some("vits-piper-en_US-ryan-high/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-en_US-ryan-high with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-en_US-ryan-high/en_US-ryan-high.onnx',
        tokens: 'vits-piper-en_US-ryan-high/tokens.txt',
        dataDir: 'vits-piper-en_US-ryan-high/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to try vits-piper-en_US-ryan-high with the Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-en_US-ryan-high/en_US-ryan-high.onnx',
    tokens: 'vits-piper-en_US-ryan-high/tokens.txt',
    dataDir: 'vits-piper-en_US-ryan-high/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-en_US-ryan-high with the Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-en_US-ryan-high/en_US-ryan-high.onnx",
    lexicon: "",
    tokens: "vits-piper-en_US-ryan-high/tokens.txt",
    dataDir: "vits-piper-en_US-ryan-high/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-en_US-ryan-high with the C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-en_US-ryan-high/en_US-ryan-high.onnx";
config.Model.Vits.Tokens = "vits-piper-en_US-ryan-high/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-en_US-ryan-high/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-en_US-ryan-high with the Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-en_US-ryan-high/en_US-ryan-high.onnx",
        tokens = "vits-piper-en_US-ryan-high/tokens.txt",
        dataDir = "vits-piper-en_US-ryan-high/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.
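The callback in the example above returns 1 to continue generation and 0 to stop it. The Python sketch below illustrates that convention only. fake_generate is a stand-in written for this illustration and is not the sherpa-onnx implementation, and make_limit_callback is a hypothetical helper that asks to stop once a given number of samples has been received.

```python
from typing import Callable, List, Sequence


def make_limit_callback(max_samples: int) -> Callable[[Sequence[float]], int]:
    """Build a callback that asks to stop once max_samples have been seen."""
    seen = 0

    def callback(samples: Sequence[float]) -> int:
        nonlocal seen
        seen += len(samples)
        return 1 if seen < max_samples else 0  # 1: continue, 0: stop

    return callback


def fake_generate(chunks: List[List[float]],
                  callback: Callable[[Sequence[float]], int]) -> List[float]:
    """Stand-in for a TTS engine: emits chunks until the callback returns 0."""
    out: List[float] = []
    for chunk in chunks:
        out.extend(chunk)
        if callback(chunk) == 0:
            break
    return out


if __name__ == "__main__":
    chunks = [[0.0] * 100, [0.0] * 100, [0.0] * 100]
    audio = fake_generate(chunks, make_limit_callback(150))
    print("Number of samples kept:", len(audio))
```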

Java API

You can use the following code to play with vits-piper-en_US-ryan-high with the Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-en_US-ryan-high/en_US-ryan-high.onnx");
    vits.setTokens("vits-piper-en_US-ryan-high/tokens.txt");
    vits.setDataDir("vits-piper-en_US-ryan-high/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-en_US-ryan-high with the Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-en_US-ryan-high/en_US-ryan-high.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-en_US-ryan-high/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-en_US-ryan-high/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-en_US-ryan-high with the Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-en_US-ryan-high/en_US-ryan-high.onnx",
				Tokens:  "vits-piper-en_US-ryan-high/tokens.txt",
				DataDir: "vits-piper-en_US-ryan-high/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.

audio samples for each speaker are listed below:

Speaker 0

vits-piper-en_US-ryan-low

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/en/en_US/ryan/low

Number of speakers | Sample rate
1 | 16000

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_US-ryan-low.tar.bz2
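If you prefer to fetch and unpack the archive from Python, the following sketch uses only the standard library. The helper names download and extract are hypothetical, and the sketch assumes the archive is a regular .tar.bz2 file, as its name suggests.

```python
import tarfile
import urllib.request
from pathlib import Path

URL = ("https://github.com/k2-fsa/sherpa-onnx/releases/download/"
       "tts-models/vits-piper-en_US-ryan-low.tar.bz2")


def download(url: str, dest_dir: Path) -> Path:
    """Download url into dest_dir and return the path of the local archive."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    archive = dest_dir / url.rsplit("/", 1)[-1]  # keep the remote file name
    urllib.request.urlretrieve(url, archive)
    return archive


def extract(archive: Path, dest_dir: Path) -> None:
    """Extract a .tar.bz2 archive into dest_dir."""
    with tarfile.open(archive, "r:bz2") as tar:
        tar.extractall(dest_dir)


# Usage (requires network access, so it is left commented out):
# extract(download(URL, Path(".")), Path("."))
```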

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI | URL | China mirror
arm64-v8a | Download | Download
armeabi-v7a | Download | Download
x86_64 | Download | Download
x86 | Download | Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-en_US-ryan-low with the C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-en_US-ryan-low/en_US-ryan-low.onnx";
  config.model.vits.tokens = "vits-piper-en_US-ryan-low/tokens.txt";
  config.model.vits.data_dir = "vits-piper-en_US-ryan-low/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // Free the generated audio and the TTS object to avoid memory leaks in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc   -I /tmp/sherpa-onnx/shared/include   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-c-api   -lonnxruntime   -o /tmp/test-piper   /tmp/test-piper.c

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.
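If you launch the compiled program from a Python script instead of a shell, the same search path can be passed through the environment of the child process. The sketch below is one way to do that. env_with_library_path is a hypothetical helper, and choosing the variable name from platform.system() is an assumption of this sketch.

```python
import os
import platform
from typing import Dict, Optional


def env_with_library_path(lib_dir: str,
                          env: Optional[Dict[str, str]] = None,
                          system: Optional[str] = None) -> Dict[str, str]:
    """Return a copy of env with lib_dir prepended to the loader search path."""
    new_env = dict(os.environ if env is None else env)
    system = system or platform.system()
    # macOS uses DYLD_LIBRARY_PATH; Linux uses LD_LIBRARY_PATH.
    var = "DYLD_LIBRARY_PATH" if system == "Darwin" else "LD_LIBRARY_PATH"
    old = new_env.get(var, "")
    new_env[var] = lib_dir + (":" + old if old else "")
    return new_env


# Usage (not executed here, since it needs the compiled binary):
# import subprocess
# subprocess.run(["./test-piper"], cwd="/tmp", check=True,
#                env=env_with_library_path("/tmp/sherpa-onnx/shared/lib"))
```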

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_US-ryan-low.tar.bz2

You can use the following code to play with vits-piper-en_US-ryan-low with the Python API.

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-en_US-ryan-low/en_US-ryan-low.onnx",
            data_dir="vits-piper-en_US-ryan-low/espeak-ng-data",
            tokens="vits-piper-en_US-ryan-low/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
                     sid=0,
                     speed=1.0)

sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
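If soundfile is not installed, the samples can also be written with Python's standard wave module. The sketch below assumes that the samples are floats in the range [-1, 1] and stores them as 16-bit mono PCM. write_wave_16bit is a hypothetical helper name, not a sherpa-onnx function.

```python
import struct
import wave
from typing import Sequence


def write_wave_16bit(filename: str, samples: Sequence[float], sample_rate: int) -> None:
    """Write mono float samples (assumed to be in [-1, 1]) as 16-bit PCM."""
    pcm = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))  # clip to avoid integer overflow
        pcm += struct.pack("<h", int(s * 32767))
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)  # 2 bytes = 16 bits per sample
        f.setframerate(sample_rate)
        f.writeframes(bytes(pcm))


# Usage with the objects from the example above (not executed here):
# write_wave_16bit("test.wav", audio.samples, audio.sample_rate)
```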

C++ API

You can use the following code to play with vits-piper-en_US-ryan-low with the C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-en_US-ryan-low/en_US-ryan-low.onnx";
  config.model.vits.tokens = "vits-piper-en_US-ryan-low/tokens.txt";
  config.model.vits.data_dir = "vits-piper-en_US-ryan-low/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger values give faster speech

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++   -std=c++17   -I /tmp/sherpa-onnx/shared/include   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-cxx-api   -lsherpa-onnx-c-api   -lonnxruntime   -o /tmp/test-piper   /tmp/test-piper.cc

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with vits-piper-en_US-ryan-low with the Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-en_US-ryan-low/en_US-ryan-low.onnx".into()),
                tokens: Some("vits-piper-en_US-ryan-low/tokens.txt".into()),
                data_dir: Some("vits-piper-en_US-ryan-low/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-en_US-ryan-low with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-en_US-ryan-low/en_US-ryan-low.onnx',
        tokens: 'vits-piper-en_US-ryan-low/tokens.txt',
        dataDir: 'vits-piper-en_US-ryan-low/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to play with vits-piper-en_US-ryan-low with the Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-en_US-ryan-low/en_US-ryan-low.onnx',
    tokens: 'vits-piper-en_US-ryan-low/tokens.txt',
    dataDir: 'vits-piper-en_US-ryan-low/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-en_US-ryan-low with the Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-en_US-ryan-low/en_US-ryan-low.onnx",
    lexicon: "",
    tokens: "vits-piper-en_US-ryan-low/tokens.txt",
    dataDir: "vits-piper-en_US-ryan-low/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-en_US-ryan-low with the C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-en_US-ryan-low/en_US-ryan-low.onnx";
config.Model.Vits.Tokens = "vits-piper-en_US-ryan-low/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-en_US-ryan-low/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-en_US-ryan-low with the Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-en_US-ryan-low/en_US-ryan-low.onnx",
        tokens = "vits-piper-en_US-ryan-low/tokens.txt",
        dataDir = "vits-piper-en_US-ryan-low/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with vits-piper-en_US-ryan-low with the Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-en_US-ryan-low/en_US-ryan-low.onnx");
    vits.setTokens("vits-piper-en_US-ryan-low/tokens.txt");
    vits.setDataDir("vits-piper-en_US-ryan-low/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-en_US-ryan-low with the Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-en_US-ryan-low/en_US-ryan-low.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-en_US-ryan-low/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-en_US-ryan-low/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-en_US-ryan-low with the Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-en_US-ryan-low/en_US-ryan-low.onnx",
				Tokens:  "vits-piper-en_US-ryan-low/tokens.txt",
				DataDir: "vits-piper-en_US-ryan-low/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.

audio samples for each speaker are listed below:

Speaker 0

vits-piper-en_US-ryan-medium

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/en/en_US/ryan/medium

Number of speakers | Sample rate
1 | 22050

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_US-ryan-medium.tar.bz2
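The code examples for this model refer to three paths inside the extracted directory: the .onnx file, tokens.txt, and espeak-ng-data. The following Python sketch checks that they exist before you run an example. missing_model_files is a hypothetical helper, and the directory name in the usage part assumes that the archive extracts to vits-piper-en_US-ryan-medium in the current directory.

```python
from pathlib import Path
from typing import List


def missing_model_files(model_dir: str, onnx_name: str) -> List[str]:
    """Return the expected paths under model_dir that do not exist."""
    root = Path(model_dir)
    expected = [root / onnx_name, root / "tokens.txt", root / "espeak-ng-data"]
    return [str(p) for p in expected if not p.exists()]


if __name__ == "__main__":
    missing = missing_model_files("vits-piper-en_US-ryan-medium",
                                  "en_US-ryan-medium.onnx")
    print("Missing:", missing if missing else "nothing")
```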

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI | URL | China mirror
arm64-v8a | Download | Download
armeabi-v7a | Download | Download
x86_64 | Download | Download
x86 | Download | Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-en_US-ryan-medium with the C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-en_US-ryan-medium/en_US-ryan-medium.onnx";
  config.model.vits.tokens = "vits-piper-en_US-ryan-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-en_US-ryan-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // Free the generated audio and the TTS object to avoid memory leaks in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc   -I /tmp/sherpa-onnx/shared/include   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-c-api   -lonnxruntime   -o /tmp/test-piper   /tmp/test-piper.c

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_US-ryan-medium.tar.bz2

You can use the following code to play with vits-piper-en_US-ryan-medium with the Python API.

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-en_US-ryan-medium/en_US-ryan-medium.onnx",
            data_dir="vits-piper-en_US-ryan-medium/espeak-ng-data",
            tokens="vits-piper-en_US-ryan-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
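
If soundfile is not available, the generated samples can also be saved with Python's standard wave module. The sketch below assumes that audio.samples is a flat sequence of floats in the range [-1, 1] for a single channel, which is an assumption based on how the samples are passed to sf.write above. The function name write_wave_int16 is hypothetical.

```python
import struct
import wave


def write_wave_int16(filename, samples, sample_rate):
    """Write mono float samples in [-1, 1] as a 16-bit PCM WAV file.

    Hypothetical helper; assumes a flat sequence of floats.
    """
    frames = bytearray()
    for s in samples:
        # Clip out-of-range values so the conversion cannot overflow int16.
        s = max(-1.0, min(1.0, float(s)))
        frames += struct.pack("<h", int(round(s * 32767)))
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)            # mono
        f.setsampwidth(2)            # 2 bytes = 16 bits per sample
        f.setframerate(sample_rate)
        f.writeframes(bytes(frames))
```

With the objects from the example above, the call would be write_wave_int16("test.wav", audio.samples, audio.sample_rate).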

C++ API

You can use the following code to try vits-piper-en_US-ryan-medium with the C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-en_US-ryan-medium/en_US-ryan-medium.onnx";
  config.model.vits.tokens = "vits-piper-en_US-ryan-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-en_US-ryan-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++   -std=c++17   -I /tmp/sherpa-onnx/shared/include   -o /tmp/test-piper   /tmp/test-piper.cc   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-cxx-api   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html
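
The progress callback in the C++ example returns 1 to continue generating and 0 to stop. The following Python sketch illustrates how that return value can be used to cut generation short once enough audio has been produced. It is an illustration of the convention only: the class name StopAfterSeconds is hypothetical, and the (samples, progress) call signature and the idea that each call receives only the newly generated chunk are assumptions that should be checked against the binding you use.

```python
class StopAfterSeconds:
    """Callback object that asks the generator to stop after max_seconds.

    Hypothetical helper. Returns 1 to continue and 0 to stop, matching the
    convention documented in the C++ ProgressCallback above.
    """

    def __init__(self, sample_rate, max_seconds):
        self.max_samples = int(sample_rate * max_seconds)
        self.num_samples = 0

    def __call__(self, samples, progress):
        # Assumption: samples holds only the chunk generated since the last call.
        self.num_samples += len(samples)
        return 1 if self.num_samples < self.max_samples else 0
```

For a model with a sample rate of 22050 Hz, StopAfterSeconds(22050, 5.0) would request a stop once at least five seconds of audio have been reported.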

Rust API

You can use the following code to try vits-piper-en_US-ryan-medium with the Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-en_US-ryan-medium/en_US-ryan-medium.onnx".into()),
                tokens: Some("vits-piper-en_US-ryan-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-en_US-ryan-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-en_US-ryan-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-en_US-ryan-medium/en_US-ryan-medium.onnx',
        tokens: 'vits-piper-en_US-ryan-medium/tokens.txt',
        dataDir: 'vits-piper-en_US-ryan-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.
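
The Node.js example reports a real-time factor (RTF), which is the elapsed generation time divided by the duration of the generated audio. A value below 1 means that synthesis ran faster than playback. The same arithmetic is sketched below in Python; the function name real_time_factor is hypothetical.

```python
def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """Return (audio duration in seconds, RTF = elapsed / duration).

    Hypothetical helper that mirrors the arithmetic of the Node.js example.
    """
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate
    return duration, elapsed_seconds / duration
```

For instance, if 44100 samples at 22050 Hz take 0.5 seconds to generate, the audio lasts 2 seconds and the RTF is 0.25. In a Python program, elapsed_seconds could be measured with time.perf_counter() around the call that generates the audio.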

Dart API

You can use the following code to try vits-piper-en_US-ryan-medium with the Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-en_US-ryan-medium/en_US-ryan-medium.onnx',
    tokens: 'vits-piper-en_US-ryan-medium/tokens.txt',
    dataDir: 'vits-piper-en_US-ryan-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to try vits-piper-en_US-ryan-medium with the Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-en_US-ryan-medium/en_US-ryan-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-en_US-ryan-medium/tokens.txt",
    dataDir: "vits-piper-en_US-ryan-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to try vits-piper-en_US-ryan-medium with the C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-en_US-ryan-medium/en_US-ryan-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-en_US-ryan-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-en_US-ryan-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to try vits-piper-en_US-ryan-medium with the Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-en_US-ryan-medium/en_US-ryan-medium.onnx",
        tokens = "vits-piper-en_US-ryan-medium/tokens.txt",
        dataDir = "vits-piper-en_US-ryan-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to try vits-piper-en_US-ryan-medium with the Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-en_US-ryan-medium/en_US-ryan-medium.onnx");
    vits.setTokens("vits-piper-en_US-ryan-medium/tokens.txt");
    vits.setDataDir("vits-piper-en_US-ryan-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to try vits-piper-en_US-ryan-medium with the Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-en_US-ryan-medium/en_US-ryan-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-en_US-ryan-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-en_US-ryan-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to try vits-piper-en_US-ryan-medium with the Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-en_US-ryan-medium/en_US-ryan-medium.onnx",
				Tokens:  "vits-piper-en_US-ryan-medium/tokens.txt",
				DataDir: "vits-piper-en_US-ryan-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.

audio samples for each speaker are listed below:

Speaker 0

vits-piper-en_US-sam-medium

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/en/en_US/sam/medium

Number of speakers    Sample rate
1                     22050

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_US-sam-medium.tar.bz2
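
The archive can be fetched and unpacked the same way as the other models in this document (wget, tar xvf, then rm). For readers who prefer to script this step, the following is a minimal Python sketch using only the standard library. The function names extract_archive and download_and_extract are hypothetical; the URL is the download address given above.

```python
import os
import tarfile
import urllib.request

MODEL_URL = (
    "https://github.com/k2-fsa/sherpa-onnx/releases/download/"
    "tts-models/vits-piper-en_US-sam-medium.tar.bz2"
)


def extract_archive(archive, dest_dir="."):
    """Extract a .tar.bz2 archive into dest_dir (like `tar xvf`)."""
    with tarfile.open(archive, "r:bz2") as tar:
        tar.extractall(dest_dir)


def download_and_extract(url=MODEL_URL, dest_dir="."):
    """Download the archive, extract it, then delete the archive.

    Hypothetical helper that mirrors the wget / tar xvf / rm steps.
    """
    archive = os.path.join(dest_dir, os.path.basename(url))
    urllib.request.urlretrieve(url, archive)
    extract_archive(archive, dest_dir)
    os.remove(archive)
```

Calling download_and_extract() in the directory where you run the examples should leave a vits-piper-en_US-sam-medium directory there, which is the layout the paths in the code examples below assume.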

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI            URL         China mirror
arm64-v8a      Download    Download
armeabi-v7a    Download    Download
x86_64         Download    Download
x86            Download    Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to try vits-piper-en_US-sam-medium with the C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-en_US-sam-medium/en_US-sam-medium.onnx";
  config.model.vits.tokens = "vits-piper-en_US-sam-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-en_US-sam-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc   -I /tmp/sherpa-onnx/shared/include   -o /tmp/test-piper   /tmp/test-piper.c   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx soundfile

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_US-sam-medium.tar.bz2

You can use the following code to try vits-piper-en_US-sam-medium with the Python API.

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-en_US-sam-medium/en_US-sam-medium.onnx",
            data_dir="vits-piper-en_US-sam-medium/espeak-ng-data",
            tokens="vits-piper-en_US-sam-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)

C++ API

You can use the following code to try vits-piper-en_US-sam-medium with the C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-en_US-sam-medium/en_US-sam-medium.onnx";
  config.model.vits.tokens = "vits-piper-en_US-sam-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-en_US-sam-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++   -std=c++17   -I /tmp/sherpa-onnx/shared/include   -o /tmp/test-piper   /tmp/test-piper.cc   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-cxx-api   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to try vits-piper-en_US-sam-medium with the Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-en_US-sam-medium/en_US-sam-medium.onnx".into()),
                tokens: Some("vits-piper-en_US-sam-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-en_US-sam-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-en_US-sam-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-en_US-sam-medium/en_US-sam-medium.onnx',
        tokens: 'vits-piper-en_US-sam-medium/tokens.txt',
        dataDir: 'vits-piper-en_US-sam-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to try vits-piper-en_US-sam-medium with the Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-en_US-sam-medium/en_US-sam-medium.onnx',
    tokens: 'vits-piper-en_US-sam-medium/tokens.txt',
    dataDir: 'vits-piper-en_US-sam-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to try vits-piper-en_US-sam-medium with the Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-en_US-sam-medium/en_US-sam-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-en_US-sam-medium/tokens.txt",
    dataDir: "vits-piper-en_US-sam-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to try vits-piper-en_US-sam-medium with the C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-en_US-sam-medium/en_US-sam-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-en_US-sam-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-en_US-sam-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to try vits-piper-en_US-sam-medium with the Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-en_US-sam-medium/en_US-sam-medium.onnx",
        tokens = "vits-piper-en_US-sam-medium/tokens.txt",
        dataDir = "vits-piper-en_US-sam-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to try vits-piper-en_US-sam-medium with the Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-en_US-sam-medium/en_US-sam-medium.onnx");
    vits.setTokens("vits-piper-en_US-sam-medium/tokens.txt");
    vits.setDataDir("vits-piper-en_US-sam-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

The following code shows how to use vits-piper-en_US-sam-medium with the Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-en_US-sam-medium/en_US-sam-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-en_US-sam-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-en_US-sam-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

The following code shows how to use vits-piper-en_US-sam-medium with the Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-en_US-sam-medium/en_US-sam-medium.onnx",
				Tokens:  "vits-piper-en_US-sam-medium/tokens.txt",
				DataDir: "vits-piper-en_US-sam-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.

the sample audio for each speaker is listed below:

Speaker 0

Estonian

This section lists text-to-speech models for Estonian.

supertonic-3-et

Info about this model

This model is Supertonic 3, from https://huggingface.co/Supertone/supertonic-3

It supports 31 languages: en, ko, ja, ar, bg, cs, da, de, el, es, et, fi, fr, hi, hr, hu, id, it, lt, lv, nl, pl, pt, ro, ru, sk, sl, sv, tr, uk, vi.

This page shows samples for Estonian (et).
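
The C, C#, Java, Pascal, and Go examples later on this page pass the language to the model as a JSON string such as {"lang": "et"} in the extra field of the generation config. The following is a minimal sketch of building that string and checking the code against the language list above. The helper name make_extra is hypothetical and is not part of sherpa-onnx.

```python
import json

# The 31 language codes listed above.
SUPPORTED_LANGS = (
    "en ko ja ar bg cs da de el es et fi fr hi hr hu id it "
    "lt lv nl pl pt ro ru sk sl sv tr uk vi"
).split()


def make_extra(lang):
    """Return the JSON string for the `extra` field (hypothetical helper)."""
    if lang not in SUPPORTED_LANGS:
        raise ValueError(f"unsupported language code: {lang!r}")
    return json.dumps({"lang": lang})


print(make_extra("et"))  # prints {"lang": "et"}
```

Rejecting an unknown code early gives a clearer error than passing it on to the model.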

Number of speakers    Sample rate
10                    24000

Speaker IDs

sid: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-supertonic-3-tts-int8-2026-05-11.tar.bz2
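
After extracting the archive, it can help to confirm that every file referenced by the API examples on this page is present before creating the TTS object. The following is a minimal sketch. The file names are taken from those examples, the directory name is assumed to match the archive name, and missing_model_files is a hypothetical helper, not a sherpa-onnx function.

```python
from pathlib import Path

# File names as they appear in the API examples on this page.
REQUIRED_FILES = [
    "duration_predictor.int8.onnx",
    "text_encoder.int8.onnx",
    "vector_estimator.int8.onnx",
    "vocoder.int8.onnx",
    "tts.json",
    "unicode_indexer.bin",
    "voice.bin",
]


def missing_model_files(model_dir):
    """Return the required files that are absent from model_dir."""
    root = Path(model_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]


# Assumed directory name: the archive name without the .tar.bz2 suffix.
model_dir = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11"
missing = missing_model_files(model_dir)
if missing:
    print("Missing files:", ", ".join(missing))
else:
    print("All model files found in", model_dir)
```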

Android APK

The following table lists the Android TTS Engine APKs that bundle this model, built with sherpa-onnx v1.13.2.

ABI            URL         China mirror
arm64-v8a      Download    Download
armeabi-v7a    Download    Download
x86_64         Download    Download
x86            Download    Download

If you are not sure which ABI your device uses, you probably need arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-supertonic-3-tts-int8-2026-05-11.tar.bz2

You can then use the following code to try supertonic-3:

import sherpa_onnx
import soundfile as sf

tts_config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        supertonic=sherpa_onnx.OfflineTtsSupertonicModelConfig(
            duration_predictor="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
            text_encoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
            vector_estimator="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
            vocoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
            tts_json="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
            unicode_indexer="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
            voice_style="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
        ),
        debug=False,
        num_threads=2,
        provider="cpu",
    ),
)

tts = sherpa_onnx.OfflineTts(tts_config)

gen_config = sherpa_onnx.GenerationConfig()
gen_config.sid = 0
gen_config.num_steps = 8
gen_config.speed = 1.0
gen_config.extra["lang"] = "et"

audio = tts.generate("See on teksti kõneks muutmise mootor, mis kasutab järgmise põlvkonna Kaldi", gen_config)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)

C API

The following code shows how to use supertonic-3 with the C API.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));
  config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
  config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
  config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
  config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
  config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
  config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
  config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";

  config.model.num_threads = 2;

  // If you don't want to see debug messages, please set it to 0
  config.model.debug = 1;

  const char *text = "See on teksti kõneks muutmise mootor, mis kasutab järgmise põlvkonna Kaldi";

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.num_steps = 8;
  gen_cfg.speed = 1.0;
  gen_cfg.extra = "{\"lang\": \"et\"}";

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg,
                                             ProgressCallback, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

To compile the example, first build sherpa-onnx as a shared library with the C API enabled:

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
 -DSHERPA_ONNX_ENABLE_C_API=ON \
 -DCMAKE_BUILD_TYPE=Release \
 -DBUILD_SHARED_LIBS=ON \
 -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
 ..

make
make install

After installation, the required header and library files are in /tmp/sherpa-onnx/shared.

Assuming you have saved the example above as /tmp/test-supertonic.c, you can compile it with the following command:

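# The source file is listed before the -l flags so that linkers
# using --as-needed can still resolve the symbols.
gcc \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-supertonic \
  /tmp/test-supertonic.c \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-c-api \
  -lonnxruntime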

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic

If the shared libraries cannot be found at run time, you probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-supertonic.
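
If you launch the compiled binary from a script instead of a shell, the same search-path setting can be passed through the child process environment. The following is a minimal sketch in Python. The helper names library_path_var and env_with_lib_dir are hypothetical, and only Linux and macOS are handled, matching the two exports above.

```python
import os
import sys


def library_path_var(platform=sys.platform):
    """Name of the dynamic-loader search path variable for the platform."""
    if platform == "darwin":
        return "DYLD_LIBRARY_PATH"
    if platform.startswith("linux"):
        return "LD_LIBRARY_PATH"
    raise ValueError(f"unsupported platform: {platform}")


def env_with_lib_dir(lib_dir, platform=sys.platform, env=None):
    """Return a copy of env with lib_dir prepended to the loader search path."""
    new_env = dict(os.environ if env is None else env)
    var = library_path_var(platform)
    old = new_env.get(var, "")
    # Both Linux and macOS separate search path entries with ":".
    new_env[var] = lib_dir + (":" + old if old else "")
    return new_env
```

The returned dictionary can then be given to subprocess.run, for example subprocess.run(["/tmp/test-supertonic"], cwd="/tmp", env=env_with_lib_dir("/tmp/sherpa-onnx/shared/lib")).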

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

C++ API

The following code shows how to use supertonic-3 with the C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;

  config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
  config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
  config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
  config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
  config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
  config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
  config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";

  config.model.num_threads = 2;

  // If you don't want to see debug messages, please set it to 0
  config.model.debug = 1;

  std::string filename = "./test.wav";
  std::string text = "See on teksti kõneks muutmise mootor, mis kasutab järgmise põlvkonna Kaldi";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.num_steps = 8;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed
  gen_cfg.extra = R"({"lang": "et"})";

  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

To compile the example, first build sherpa-onnx as a shared library with the C API enabled:

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
 -DSHERPA_ONNX_ENABLE_C_API=ON \
 -DCMAKE_BUILD_TYPE=Release \
 -DBUILD_SHARED_LIBS=ON \
 -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
 ..

make
make install

After installation, the required header and library files are in /tmp/sherpa-onnx/shared.

Assuming you have saved the example above as /tmp/test-supertonic.cc, you can compile it with the following command:

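# The source file is listed before the -l flags so that linkers
# using --as-needed can still resolve the symbols.
g++ \
  -std=c++17 \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-supertonic \
  /tmp/test-supertonic.cc \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-cxx-api \
  -lsherpa-onnx-c-api \
  -lonnxruntime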

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic

If the shared libraries cannot be found at run time, you probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-supertonic.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

The following code shows how to use supertonic-3 with the Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsSupertonicModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            supertonic: OfflineTtsSupertonicModelConfig {
                duration_predictor: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx".into()),
                text_encoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx".into()),
                vector_estimator: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx".into()),
                vocoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx".into()),
                tts_json: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json".into()),
                unicode_indexer: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin".into()),
                voice_style: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "See on teksti kõneks muutmise mootor, mis kasutab järgmise põlvkonna Kaldi";

    let gen_config = GenerationConfig {
        sid: 0,
        num_steps: 8,
        speed: 1.0,
        extra: Some(r#"{"lang": "et"}"#.into()),
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

The following code shows how to use supertonic-3 with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      supertonic: {
        durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
        textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
        vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
        vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
        ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
        unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
        voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
      },
      debug: true,
      numThreads: 2,
      provider: 'cpu',
    },
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'See on teksti kõneks muutmise mootor, mis kasutab järgmise põlvkonna Kaldi';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  numSteps: 8,
  speed: 1.0,
  extra: {lang: 'et'},
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.
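
The Node.js example above reports a real-time factor (RTF): the time spent generating divided by the duration of the generated audio, so a value below 1 means synthesis runs faster than playback. The same arithmetic can be sketched in Python as follows. The function name real_time_factor is hypothetical, and the numbers are made up for illustration, not measured results.

```python
def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """RTF = processing time / audio duration."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # audio length in seconds
    return elapsed_seconds / duration


# Illustrative values only: 72000 samples at 24000 Hz is 3 seconds of audio,
# assumed here to have been generated in 0.6 seconds.
rtf = real_time_factor(elapsed_seconds=0.6, num_samples=72000, sample_rate=24000)
print(f"RTF = {rtf:.3f}")  # prints RTF = 0.200
```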

Dart API

The following code shows how to use supertonic-3 with the Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final supertonic = sherpa_onnx.OfflineTtsSupertonicModelConfig(
    durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
    textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
    vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
    vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
    ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
    unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
    voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    supertonic: supertonic,
    numThreads: 2,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    numSteps: 8,
    speed: 1.0,
    extra: {'lang': 'et'},
  );
  final audio = tts.generateWithConfig(text: 'See on teksti kõneks muutmise mootor, mis kasutab järgmise põlvkonna Kaldi', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

The following code shows how to use supertonic-3 with the Swift API.

func run() {
  let supertonic = sherpaOnnxOfflineTtsSupertonicModelConfig(
    durationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
    textEncoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
    vectorEstimator: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
    vocoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
    ttsJson: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
    unicodeIndexer: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
    voiceStyle: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(supertonic: supertonic)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "See on teksti kõneks muutmise mootor, mis kasutab järgmise põlvkonna Kaldi"
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.numSteps = 8
  genConfig.speed = 1.0
  genConfig.extra = ["lang": "et"]

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

The following code shows how to use supertonic-3 with the C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Supertonic.DurationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.Model.Supertonic.TextEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.Model.Supertonic.VectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.Model.Supertonic.Vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.Model.Supertonic.TtsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.Model.Supertonic.UnicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.Model.Supertonic.VoiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";

config.Model.NumThreads = 2;
config.Model.Debug = 1;
config.Model.Provider = "cpu";

var tts = new OfflineTts(config);
var text = "See on teksti kõneks muutmise mootor, mis kasutab järgmise põlvkonna Kaldi";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.NumSteps = 8;
genConfig.Speed = 1.0f;
genConfig.Extra = "{\"lang\": \"et\"}";

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

The following code shows how to use supertonic-3 with the Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      supertonic = OfflineTtsSupertonicModelConfig(
        durationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
        textEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
        vectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
        vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
        ttsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
        unicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
        voiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
      ),
      numThreads = 2,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    numSteps = 8,
    speed = 1.0f,
    extra = mapOf("lang" to "et"),
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "See on teksti kõneks muutmise mootor, mis kasutab järgmise põlvkonna Kaldi",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

The following code shows how to use supertonic-3 with the Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var supertonic = new OfflineTtsSupertonicModelConfig();
    supertonic.setDurationPredictor("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx");
    supertonic.setTextEncoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx");
    supertonic.setVectorEstimator("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx");
    supertonic.setVocoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx");
    supertonic.setTtsJson("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json");
    supertonic.setUnicodeIndexer("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin");
    supertonic.setVoiceStyle("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setSupertonic(supertonic);
    modelConfig.setNumThreads(2);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);

    var tts = new OfflineTts(config);
    var text = "See on teksti kõneks muutmise mootor, mis kasutab järgmise põlvkonna Kaldi";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setNumSteps(8);
    genConfig.setSpeed(1.0f);
    genConfig.setExtra("{\"lang\": \"et\"}");

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

The following code shows how to use supertonic-3 with the Pascal API.

program test_supertonic;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Supertonic.DurationPredictor := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx';
  Config.Model.Supertonic.TextEncoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx';
  Config.Model.Supertonic.VectorEstimator := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx';
  Config.Model.Supertonic.Vocoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx';
  Config.Model.Supertonic.TtsJson := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json';
  Config.Model.Supertonic.UnicodeIndexer := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin';
  Config.Model.Supertonic.VoiceStyle := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin';

  Config.Model.NumThreads := 2;
  Config.Model.Debug := True;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.NumSteps := 8;
  GenConfig.Speed := 1.0;
  GenConfig.Extra := '{"lang": "et"}';

  Audio := Tts.GenerateWithConfig('See on teksti kõneks muutmise mootor, mis kasutab järgmise põlvkonna Kaldi', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

The following code shows how to use supertonic-3 with the Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Supertonic: sherpa.OfflineTtsSupertonicModelConfig{
				DurationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
				TextEncoder:       "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
				VectorEstimator:   "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
				Vocoder:           "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
				TtsJson:           "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
				UnicodeIndexer:    "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
				VoiceStyle:        "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
			},
			NumThreads: 2,
			Debug:      true,
		},
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "See on teksti kõneks muutmise mootor, mis kasutab järgmise põlvkonna Kaldi"

	genConfig := sherpa.GenerationConfig{
		Sid:       0,
		NumSteps:  8,
		Speed:     1.0,
		Extra:     `{"lang": "et"}`,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

Sample audio for each speaker is listed below:

Speaker 0

0

Tere maailm.

1

Kuidas sul täna läheb?

2

Taevas on sinine ja tuul on vaikne.

3

Masinõpe aitab arvutitel andmetest õppida.

4

Kõnesüntees muudab teksti selgeks heliks.

5

Õpilased lugesid raamatukogus lühikest lugu.

6

Rong hilines rööbaste hoolduse tõttu.

7

Väikesed mudelid töötavad kiiresti kohalikes seadmetes.

8

Häälassistent aitab igapäevaste ülesannetega.

9

Stabiilne lugemine on tähtis nii lühikeste kui pikkade lausete jaoks.

Speaker 1

0

Tere maailm.

1

Kuidas sul täna läheb?

2

Taevas on sinine ja tuul on vaikne.

3

Masinõpe aitab arvutitel andmetest õppida.

4

Kõnesüntees muudab teksti selgeks heliks.

5

Õpilased lugesid raamatukogus lühikest lugu.

6

Rong hilines rööbaste hoolduse tõttu.

7

Väikesed mudelid töötavad kiiresti kohalikes seadmetes.

8

Häälassistent aitab igapäevaste ülesannetega.

9

Stabiilne lugemine on tähtis nii lühikeste kui pikkade lausete jaoks.

Speaker 2

0

Tere maailm.

1

Kuidas sul täna läheb?

2

Taevas on sinine ja tuul on vaikne.

3

Masinõpe aitab arvutitel andmetest õppida.

4

Kõnesüntees muudab teksti selgeks heliks.

5

Õpilased lugesid raamatukogus lühikest lugu.

6

Rong hilines rööbaste hoolduse tõttu.

7

Väikesed mudelid töötavad kiiresti kohalikes seadmetes.

8

Häälassistent aitab igapäevaste ülesannetega.

9

Stabiilne lugemine on tähtis nii lühikeste kui pikkade lausete jaoks.

Speaker 3

0

Tere maailm.

1

Kuidas sul täna läheb?

2

Taevas on sinine ja tuul on vaikne.

3

Masinõpe aitab arvutitel andmetest õppida.

4

Kõnesüntees muudab teksti selgeks heliks.

5

Õpilased lugesid raamatukogus lühikest lugu.

6

Rong hilines rööbaste hoolduse tõttu.

7

Väikesed mudelid töötavad kiiresti kohalikes seadmetes.

8

Häälassistent aitab igapäevaste ülesannetega.

9

Stabiilne lugemine on tähtis nii lühikeste kui pikkade lausete jaoks.

Speaker 4

0

Tere maailm.

1

Kuidas sul täna läheb?

2

Taevas on sinine ja tuul on vaikne.

3

Masinõpe aitab arvutitel andmetest õppida.

4

Kõnesüntees muudab teksti selgeks heliks.

5

Õpilased lugesid raamatukogus lühikest lugu.

6

Rong hilines rööbaste hoolduse tõttu.

7

Väikesed mudelid töötavad kiiresti kohalikes seadmetes.

8

Häälassistent aitab igapäevaste ülesannetega.

9

Stabiilne lugemine on tähtis nii lühikeste kui pikkade lausete jaoks.

Speaker 5

0

Tere maailm.

1

Kuidas sul täna läheb?

2

Taevas on sinine ja tuul on vaikne.

3

Masinõpe aitab arvutitel andmetest õppida.

4

Kõnesüntees muudab teksti selgeks heliks.

5

Õpilased lugesid raamatukogus lühikest lugu.

6

Rong hilines rööbaste hoolduse tõttu.

7

Väikesed mudelid töötavad kiiresti kohalikes seadmetes.

8

Häälassistent aitab igapäevaste ülesannetega.

9

Stabiilne lugemine on tähtis nii lühikeste kui pikkade lausete jaoks.

Speaker 6

0

Tere maailm.

1

Kuidas sul täna läheb?

2

Taevas on sinine ja tuul on vaikne.

3

Masinõpe aitab arvutitel andmetest õppida.

4

Kõnesüntees muudab teksti selgeks heliks.

5

Õpilased lugesid raamatukogus lühikest lugu.

6

Rong hilines rööbaste hoolduse tõttu.

7

Väikesed mudelid töötavad kiiresti kohalikes seadmetes.

8

Häälassistent aitab igapäevaste ülesannetega.

9

Stabiilne lugemine on tähtis nii lühikeste kui pikkade lausete jaoks.

Speaker 7

0

Tere maailm.

1

Kuidas sul täna läheb?

2

Taevas on sinine ja tuul on vaikne.

3

Masinõpe aitab arvutitel andmetest õppida.

4

Kõnesüntees muudab teksti selgeks heliks.

5

Õpilased lugesid raamatukogus lühikest lugu.

6

Rong hilines rööbaste hoolduse tõttu.

7

Väikesed mudelid töötavad kiiresti kohalikes seadmetes.

8

Häälassistent aitab igapäevaste ülesannetega.

9

Stabiilne lugemine on tähtis nii lühikeste kui pikkade lausete jaoks.

Speaker 8

0

Tere maailm.

1

Kuidas sul täna läheb?

2

Taevas on sinine ja tuul on vaikne.

3

Masinõpe aitab arvutitel andmetest õppida.

4

Kõnesüntees muudab teksti selgeks heliks.

5

Õpilased lugesid raamatukogus lühikest lugu.

6

Rong hilines rööbaste hoolduse tõttu.

7

Väikesed mudelid töötavad kiiresti kohalikes seadmetes.

8

Häälassistent aitab igapäevaste ülesannetega.

9

Stabiilne lugemine on tähtis nii lühikeste kui pikkade lausete jaoks.

Speaker 9

0

Tere maailm.

1

Kuidas sul täna läheb?

2

Taevas on sinine ja tuul on vaikne.

3

Masinõpe aitab arvutitel andmetest õppida.

4

Kõnesüntees muudab teksti selgeks heliks.

5

Õpilased lugesid raamatukogus lühikest lugu.

6

Rong hilines rööbaste hoolduse tõttu.

7

Väikesed mudelid töötavad kiiresti kohalikes seadmetes.

8

Häälassistent aitab igapäevaste ülesannetega.

9

Stabiilne lugemine on tähtis nii lühikeste kui pikkade lausete jaoks.

Finnish

This section lists text-to-speech models for Finnish.

vits-piper-fi_FI-harri-low

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/fi/fi_FI/harri/low

| Number of speakers | Sample rate |
|---|---|
| 1 | 16000 |

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-fi_FI-harri-low.tar.bz2
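
If you prefer to fetch and unpack the archive from Python, the following is a minimal sketch that uses only the standard library. The helper names `extract_archive` and `download_and_extract` are hypothetical and are not part of sherpa-onnx; the URL is the download address given above.

```python
import tarfile
import urllib.request
from pathlib import Path, PurePosixPath

MODEL_URL = (
    "https://github.com/k2-fsa/sherpa-onnx/releases/download/"
    "tts-models/vits-piper-fi_FI-harri-low.tar.bz2"
)


def extract_archive(archive_path, dest_dir="."):
    """Extract a .tar.bz2 archive into dest_dir; return its top-level entry names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    top_level = set()
    with tarfile.open(archive_path, "r:bz2") as tar:
        for member in tar.getmembers():
            parts = PurePosixPath(member.name).parts
            if parts:
                top_level.add(parts[0])
        # Use the safer "data" extraction filter when this Python version has it.
        kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
        tar.extractall(dest, **kwargs)
    return sorted(top_level)


def download_and_extract(url=MODEL_URL, dest_dir="."):
    """Download the archive into dest_dir, extract it, then delete the archive."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    archive = dest / url.rsplit("/", 1)[-1]
    urllib.request.urlretrieve(url, archive)
    try:
        return extract_archive(archive, dest)
    finally:
        archive.unlink()


# Example (needs network access):
# download_and_extract()
```

Calling `download_and_extract()` in the current directory should leave a `vits-piper-fi_FI-harri-low` directory next to your script, which is the layout the code examples in this section assume.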

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

| ABI | URL | China mirror |
|---|---|---|
| arm64-v8a | Download | Download |
| armeabi-v7a | Download | Download |
| x86_64 | Download | Download |
| x86 | Download | Download |

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-fi_FI-harri-low with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-fi_FI-harri-low/fi_FI-harri-low.onnx";
  config.model.vits.tokens = "vits-piper-fi_FI-harri-low/tokens.txt";
  config.model.vits.data_dir = "vits-piper-fi_FI-harri-low/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Sateenkaaren päässä on kultaa, mutta vain ne, jotka siihen uskovat, voivat sen löytää.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
  -DSHERPA_ONNX_ENABLE_C_API=ON \
  -DCMAKE_BUILD_TYPE=Release \
  -DBUILD_SHARED_LIBS=ON \
  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
  ..

make
make install

You can find the required header and library files in /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc \
  -I /tmp/sherpa-onnx/shared/include \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-c-api \
  -lonnxruntime \
  -o /tmp/test-piper \
  /tmp/test-piper.c

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.
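
If you launch the compiled program from a Python script rather than a shell, you can set the library search path per process instead of exporting it. The sketch below is one way to do that; `library_env` is a hypothetical helper, not part of sherpa-onnx, and the paths are the ones used in the build steps above.

```python
import os
import sys


def library_env(lib_dir, base_env=None, platform=sys.platform):
    """Return a copy of base_env with lib_dir prepended to the loader search path."""
    env = dict(os.environ if base_env is None else base_env)
    # macOS reads DYLD_LIBRARY_PATH; Linux reads LD_LIBRARY_PATH.
    var = "DYLD_LIBRARY_PATH" if platform == "darwin" else "LD_LIBRARY_PATH"
    current = env.get(var)
    # Only append the old value when it is set, to avoid a trailing separator.
    env[var] = f"{lib_dir}{os.pathsep}{current}" if current else lib_dir
    return env


# Example (assumes /tmp/test-piper was built as described above):
# import subprocess
# subprocess.run(
#     ["/tmp/test-piper"],
#     cwd="/tmp",
#     env=library_env("/tmp/sherpa-onnx/shared/lib"),
#     check=True,
# )
```

The variable is changed only for the child process, so your shell environment stays untouched.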

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-fi_FI-harri-low.tar.bz2

You can use the following code to play with vits-piper-fi_FI-harri-low

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-fi_FI-harri-low/fi_FI-harri-low.onnx",
            data_dir="vits-piper-fi_FI-harri-low/espeak-ng-data",
            tokens="vits-piper-fi_FI-harri-low/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Sateenkaaren päässä on kultaa, mutta vain ne, jotka siihen uskovat, voivat sen löytää.",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
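
To check how fast synthesis runs on your machine, you can measure the real-time factor (RTF), that is, processing time divided by the duration of the generated audio. The sketch below shows the arithmetic; `real_time_factor` and `timed_generate` are hypothetical helpers, not part of the sherpa-onnx Python package.

```python
import time


def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """RTF = processing time / audio duration; below 1.0 is faster than real time."""
    duration = num_samples / sample_rate
    return elapsed_seconds / duration


def timed_generate(generate, **kwargs):
    """Call generate(**kwargs) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = generate(**kwargs)
    return result, time.perf_counter() - start


# Example with the tts object from the code above
# (requires sherpa-onnx and the downloaded model):
# audio, elapsed = timed_generate(tts.generate, text="Hei maailma.", sid=0, speed=1.0)
# print(real_time_factor(elapsed, len(audio.samples), audio.sample_rate))
```

For instance, one second of 16000 Hz audio (16000 samples) generated in 0.5 seconds gives an RTF of 0.5.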

C++ API

You can use the following code to play with vits-piper-fi_FI-harri-low with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-fi_FI-harri-low/fi_FI-harri-low.onnx";
  config.model.vits.tokens = "vits-piper-fi_FI-harri-low/tokens.txt";
  config.model.vits.data_dir = "vits-piper-fi_FI-harri-low/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Sateenkaaren päässä on kultaa, mutta vain ne, jotka siihen uskovat, voivat sen löytää.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
  -DSHERPA_ONNX_ENABLE_C_API=ON \
  -DCMAKE_BUILD_TYPE=Release \
  -DBUILD_SHARED_LIBS=ON \
  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
  ..

make
make install

You can find the required header and library files in /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++ \
  -std=c++17 \
  -I /tmp/sherpa-onnx/shared/include \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-cxx-api \
  -lsherpa-onnx-c-api \
  -lonnxruntime \
  -o /tmp/test-piper \
  /tmp/test-piper.cc

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with vits-piper-fi_FI-harri-low with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-fi_FI-harri-low/fi_FI-harri-low.onnx".into()),
                tokens: Some("vits-piper-fi_FI-harri-low/tokens.txt".into()),
                data_dir: Some("vits-piper-fi_FI-harri-low/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Sateenkaaren päässä on kultaa, mutta vain ne, jotka siihen uskovat, voivat sen löytää.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-fi_FI-harri-low with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-fi_FI-harri-low/fi_FI-harri-low.onnx',
        tokens: 'vits-piper-fi_FI-harri-low/tokens.txt',
        dataDir: 'vits-piper-fi_FI-harri-low/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Sateenkaaren päässä on kultaa, mutta vain ne, jotka siihen uskovat, voivat sen löytää.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to play with vits-piper-fi_FI-harri-low with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-fi_FI-harri-low/fi_FI-harri-low.onnx',
    tokens: 'vits-piper-fi_FI-harri-low/tokens.txt',
    dataDir: 'vits-piper-fi_FI-harri-low/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Sateenkaaren päässä on kultaa, mutta vain ne, jotka siihen uskovat, voivat sen löytää.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-fi_FI-harri-low with Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-fi_FI-harri-low/fi_FI-harri-low.onnx",
    lexicon: "",
    tokens: "vits-piper-fi_FI-harri-low/tokens.txt",
    dataDir: "vits-piper-fi_FI-harri-low/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Sateenkaaren päässä on kultaa, mutta vain ne, jotka siihen uskovat, voivat sen löytää."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-fi_FI-harri-low with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-fi_FI-harri-low/fi_FI-harri-low.onnx";
config.Model.Vits.Tokens = "vits-piper-fi_FI-harri-low/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-fi_FI-harri-low/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Sateenkaaren päässä on kultaa, mutta vain ne, jotka siihen uskovat, voivat sen löytää.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-fi_FI-harri-low with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-fi_FI-harri-low/fi_FI-harri-low.onnx",
        tokens = "vits-piper-fi_FI-harri-low/tokens.txt",
        dataDir = "vits-piper-fi_FI-harri-low/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Sateenkaaren päässä on kultaa, mutta vain ne, jotka siihen uskovat, voivat sen löytää.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with vits-piper-fi_FI-harri-low with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-fi_FI-harri-low/fi_FI-harri-low.onnx");
    vits.setTokens("vits-piper-fi_FI-harri-low/tokens.txt");
    vits.setDataDir("vits-piper-fi_FI-harri-low/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Sateenkaaren päässä on kultaa, mutta vain ne, jotka siihen uskovat, voivat sen löytää.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-fi_FI-harri-low with Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-fi_FI-harri-low/fi_FI-harri-low.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-fi_FI-harri-low/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-fi_FI-harri-low/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Sateenkaaren päässä on kultaa, mutta vain ne, jotka siihen uskovat, voivat sen löytää.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-fi_FI-harri-low with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-fi_FI-harri-low/fi_FI-harri-low.onnx",
				Tokens:  "vits-piper-fi_FI-harri-low/tokens.txt",
				DataDir: "vits-piper-fi_FI-harri-low/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Sateenkaaren päässä on kultaa, mutta vain ne, jotka siihen uskovat, voivat sen löytää."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Sateenkaaren päässä on kultaa, mutta vain ne, jotka siihen uskovat, voivat sen löytää.

sample audio for each speaker is listed below:

Speaker 0

vits-piper-fi_FI-harri-medium

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/fi/fi_FI/harri/medium

| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-fi_FI-harri-medium.tar.bz2
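
The two Finnish Piper models in this section share a naming pattern: the archive, the extracted directory, and the `.onnx` file are all derived from the locale, voice, and quality. The sketch below captures that pattern. `piper_model_paths` is a hypothetical helper, and applying the pattern to models other than `fi_FI-harri-low` and `fi_FI-harri-medium` is an assumption you should check against the download page.

```python
RELEASE_BASE = "https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models"


def piper_model_paths(locale, voice, quality):
    """Build the archive URL and the relative file paths used by the examples."""
    name = f"vits-piper-{locale}-{voice}-{quality}"
    return {
        "archive_url": f"{RELEASE_BASE}/{name}.tar.bz2",
        "model": f"{name}/{locale}-{voice}-{quality}.onnx",
        "tokens": f"{name}/tokens.txt",
        "data_dir": f"{name}/espeak-ng-data",
    }


paths = piper_model_paths("fi_FI", "harri", "medium")
print(paths["archive_url"])
print(paths["model"])
```

The returned `model`, `tokens`, and `data_dir` values match the paths passed to the VITS model config in the code examples for this model.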

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

| ABI | URL | China mirror |
|---|---|---|
| arm64-v8a | Download | Download |
| armeabi-v7a | Download | Download |
| x86_64 | Download | Download |
| x86 | Download | Download |

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-fi_FI-harri-medium with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-fi_FI-harri-medium/fi_FI-harri-medium.onnx";
  config.model.vits.tokens = "vits-piper-fi_FI-harri-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-fi_FI-harri-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Sateenkaaren päässä on kultaa, mutta vain ne, jotka siihen uskovat, voivat sen löytää.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
  -DSHERPA_ONNX_ENABLE_C_API=ON \
  -DCMAKE_BUILD_TYPE=Release \
  -DBUILD_SHARED_LIBS=ON \
  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
  ..

make
make install

You can find the required header and library files in /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc \
  -I /tmp/sherpa-onnx/shared/include \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-c-api \
  -lonnxruntime \
  -o /tmp/test-piper \
  /tmp/test-piper.c

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-fi_FI-harri-medium.tar.bz2

You can use the following code to play with vits-piper-fi_FI-harri-medium

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-fi_FI-harri-medium/fi_FI-harri-medium.onnx",
            data_dir="vits-piper-fi_FI-harri-medium/espeak-ng-data",
            tokens="vits-piper-fi_FI-harri-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Sateenkaaren päässä on kultaa, mutta vain ne, jotka siihen uskovat, voivat sen löytää.",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
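
The example above assumes the extracted model directory sits in the current working directory. If the config check fails, a quick way to see which path is wrong is to test each required file yourself. The sketch below does that; `missing_model_files` is a hypothetical helper, not part of sherpa-onnx, and the list of required entries is taken from the three paths used in the config above.

```python
from pathlib import Path


def missing_model_files(model_dir, onnx_name):
    """Return the required model paths that do not exist under model_dir."""
    root = Path(model_dir)
    required = [
        root / onnx_name,         # acoustic model
        root / "tokens.txt",      # token table
        root / "espeak-ng-data",  # espeak-ng data directory
    ]
    return [str(path) for path in required if not path.exists()]


missing = missing_model_files(
    "vits-piper-fi_FI-harri-medium", "fi_FI-harri-medium.onnx"
)
if missing:
    print("Missing:", ", ".join(missing))
else:
    print("All model files found")
```

An empty list means all three paths exist; it does not verify that the files themselves are valid.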

C++ API

You can use the following code to play with vits-piper-fi_FI-harri-medium with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-fi_FI-harri-medium/fi_FI-harri-medium.onnx";
  config.model.vits.tokens = "vits-piper-fi_FI-harri-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-fi_FI-harri-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Sateenkaaren päässä on kultaa, mutta vain ne, jotka siihen uskovat, voivat sen löytää.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
  -DSHERPA_ONNX_ENABLE_C_API=ON \
  -DCMAKE_BUILD_TYPE=Release \
  -DBUILD_SHARED_LIBS=ON \
  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
  ..

make
make install

You can find the required header and library files in /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++ \
  -std=c++17 \
  -I /tmp/sherpa-onnx/shared/include \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-cxx-api \
  -lsherpa-onnx-c-api \
  -lonnxruntime \
  -o /tmp/test-piper \
  /tmp/test-piper.cc

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with vits-piper-fi_FI-harri-medium with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-fi_FI-harri-medium/fi_FI-harri-medium.onnx".into()),
                tokens: Some("vits-piper-fi_FI-harri-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-fi_FI-harri-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Sateenkaaren päässä on kultaa, mutta vain ne, jotka siihen uskovat, voivat sen löytää.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-fi_FI-harri-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-fi_FI-harri-medium/fi_FI-harri-medium.onnx',
        tokens: 'vits-piper-fi_FI-harri-medium/tokens.txt',
        dataDir: 'vits-piper-fi_FI-harri-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Sateenkaaren päässä on kultaa, mutta vain ne, jotka siihen uskovat, voivat sen löytää.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to play with vits-piper-fi_FI-harri-medium with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-fi_FI-harri-medium/fi_FI-harri-medium.onnx',
    tokens: 'vits-piper-fi_FI-harri-medium/tokens.txt',
    dataDir: 'vits-piper-fi_FI-harri-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Sateenkaaren päässä on kultaa, mutta vain ne, jotka siihen uskovat, voivat sen löytää.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-fi_FI-harri-medium with Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-fi_FI-harri-medium/fi_FI-harri-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-fi_FI-harri-medium/tokens.txt",
    dataDir: "vits-piper-fi_FI-harri-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Sateenkaaren päässä on kultaa, mutta vain ne, jotka siihen uskovat, voivat sen löytää."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-fi_FI-harri-medium with C# API.

using System;
using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-fi_FI-harri-medium/fi_FI-harri-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-fi_FI-harri-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-fi_FI-harri-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Sateenkaaren päässä on kultaa, mutta vain ne, jotka siihen uskovat, voivat sen löytää.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-fi_FI-harri-medium with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-fi_FI-harri-medium/fi_FI-harri-medium.onnx",
        tokens = "vits-piper-fi_FI-harri-medium/tokens.txt",
        dataDir = "vits-piper-fi_FI-harri-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Sateenkaaren päässä on kultaa, mutta vain ne, jotka siihen uskovat, voivat sen löytää.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with vits-piper-fi_FI-harri-medium with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-fi_FI-harri-medium/fi_FI-harri-medium.onnx");
    vits.setTokens("vits-piper-fi_FI-harri-medium/tokens.txt");
    vits.setDataDir("vits-piper-fi_FI-harri-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Sateenkaaren päässä on kultaa, mutta vain ne, jotka siihen uskovat, voivat sen löytää.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-fi_FI-harri-medium with Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-fi_FI-harri-medium/fi_FI-harri-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-fi_FI-harri-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-fi_FI-harri-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Sateenkaaren päässä on kultaa, mutta vain ne, jotka siihen uskovat, voivat sen löytää.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-fi_FI-harri-medium with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-fi_FI-harri-medium/fi_FI-harri-medium.onnx",
				Tokens:  "vits-piper-fi_FI-harri-medium/tokens.txt",
				DataDir: "vits-piper-fi_FI-harri-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Sateenkaaren päässä on kultaa, mutta vain ne, jotka siihen uskovat, voivat sen löytää."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Sateenkaaren päässä on kultaa, mutta vain ne, jotka siihen uskovat, voivat sen löytää.

sample audios for different speakers are listed below:

Speaker 0

supertonic-3-fi

Info about this model

This model is supertonic 3 from https://huggingface.co/Supertone/supertonic-3

It supports 31 languages: en, ko, ja, ar, bg, cs, da, de, el, es, et, fi, fr, hi, hr, hu, id, it, lt, lv, nl, pl, pt, ro, ru, sk, sl, sv, tr, uk, vi.

This page shows samples for Finnish (fi).
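
The examples further down this page select the language through a `lang` entry in the generation config's `extra` field; several of them (C, C#, Java, Go, Pascal) pass it as a JSON string. As a minimal sketch, the following Python helper builds that string and rejects language codes that are not in the list above. The helper name `make_extra` is hypothetical and not part of sherpa-onnx.

```python
import json

# The 31 language codes listed above.
SUPPORTED_LANGS = (
    "en", "ko", "ja", "ar", "bg", "cs", "da", "de", "el", "es", "et",
    "fi", "fr", "hi", "hr", "hu", "id", "it", "lt", "lv", "nl", "pl",
    "pt", "ro", "ru", "sk", "sl", "sv", "tr", "uk", "vi",
)


def make_extra(lang):
    """Return a JSON string such as {"lang": "fi"} for a supported code."""
    if lang not in SUPPORTED_LANGS:
        raise ValueError(f"unsupported language code: {lang}")
    return json.dumps({"lang": lang})


extra = make_extra("fi")
print(extra)
```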

| Number of speakers | Sample rate |
|---|---|
| 10 | 24000 |

Speaker IDs

sid: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-supertonic-3-tts-int8-2026-05-11.tar.bz2
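
If you prefer to script the download, the following Python sketch fetches the archive and unpacks it using only the standard library. It assumes the archive extracts into a directory named after the archive without the `.tar.bz2` suffix, which matches the paths used in the examples on this page. The helper names are hypothetical.

```python
import tarfile
import urllib.request
from pathlib import Path

MODEL_URL = (
    "https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/"
    "sherpa-onnx-supertonic-3-tts-int8-2026-05-11.tar.bz2"
)


def archive_name(url):
    """Return the file name component of a download URL."""
    return url.rsplit("/", 1)[-1]


def model_dir_name(url):
    """Return the archive name without its .tar.bz2 suffix."""
    name = archive_name(url)
    suffix = ".tar.bz2"
    if not name.endswith(suffix):
        raise ValueError(f"expected a {suffix} archive, got {name}")
    return name[: -len(suffix)]


def download_and_extract(url, dest):
    """Download url into dest, extract it there, and delete the archive."""
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    archive = dest / archive_name(url)
    urllib.request.urlretrieve(url, archive)
    with tarfile.open(archive, "r:bz2") as tar:
        tar.extractall(dest)
    archive.unlink()
    # Assumption: the archive contains a top-level directory with this name.
    return dest / model_dir_name(url)
```

Calling `download_and_extract(MODEL_URL, ".")` would place the model directory in the current working directory.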

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

| ABI | URL | China mirror |
|---|---|---|
| arm64-v8a | Download | Download |
| armeabi-v7a | Download | Download |
| x86_64 | Download | Download |
| x86 | Download | Download |

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-supertonic-3-tts-int8-2026-05-11.tar.bz2

You can use the following code to play with supertonic-3

import sherpa_onnx
import soundfile as sf

tts_config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        supertonic=sherpa_onnx.OfflineTtsSupertonicModelConfig(
            duration_predictor="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
            text_encoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
            vector_estimator="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
            vocoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
            tts_json="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
            unicode_indexer="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
            voice_style="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
        ),
        debug=False,
        num_threads=2,
        provider="cpu",
    ),
)

tts = sherpa_onnx.OfflineTts(tts_config)

gen_config = sherpa_onnx.GenerationConfig()
gen_config.sid = 0
gen_config.num_steps = 8
gen_config.speed = 1.0
gen_config.extra["lang"] = "fi"

audio = tts.generate("Tämä on tekstistä puheeksi -moottori, joka käyttää seuraavan sukupolven kaldia", gen_config)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
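
A wrong path in the configuration is easy to miss, so it can help to check the model directory before creating the TTS object. The following is a minimal sketch that uses only the standard library; the file names are the ones from the configuration above, and the helper `missing_model_files` is hypothetical.

```python
from pathlib import Path

# File names taken from the configuration shown above.
REQUIRED_FILES = (
    "duration_predictor.int8.onnx",
    "text_encoder.int8.onnx",
    "vector_estimator.int8.onnx",
    "vocoder.int8.onnx",
    "tts.json",
    "unicode_indexer.bin",
    "voice.bin",
)


def missing_model_files(model_dir):
    """Return the required file names that are absent from model_dir."""
    root = Path(model_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]


missing = missing_model_files("sherpa-onnx-supertonic-3-tts-int8-2026-05-11")
if missing:
    print("Missing files:", ", ".join(missing))
else:
    print("All model files found")
```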

C API

You can use the following code to play with supertonic-3 with C API.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));
  config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
  config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
  config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
  config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
  config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
  config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
  config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";

  config.model.num_threads = 2;

  // If you don't want to see debug messages, please set it to 0
  config.model.debug = 1;

  const char *text = "Tämä on tekstistä puheeksi -moottori, joka käyttää seuraavan sukupolven kaldia";

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.num_steps = 8;
  gen_cfg.speed = 1.0;
  gen_cfg.extra = "{\"lang\": \"fi\"}";

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg,
                                             ProgressCallback, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

To compile the above example, first build sherpa-onnx as shared libraries:

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
 -DSHERPA_ONNX_ENABLE_C_API=ON \
 -DCMAKE_BUILD_TYPE=Release \
 -DBUILD_SHARED_LIBS=ON \
 -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
 ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-supertonic.c. Then you can compile it with the following command:

gcc \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-supertonic \
  /tmp/test-supertonic.c \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-supertonic.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html
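
The progress callback in the C example above returns 1 to continue generating and 0 to stop. The following Python sketch illustrates that contract without using sherpa-onnx at all: `generate_with_callback` is a hypothetical stand-in that feeds chunks of samples to a callback and stops as soon as the callback returns 0. Whether the library keeps the samples produced before a stop is an assumption of this sketch, not something this page states.

```python
def generate_with_callback(chunks, callback):
    """Hypothetical stand-in: pass each chunk to callback until it returns 0."""
    chunks = list(chunks)
    total = len(chunks)
    collected = []
    for i, chunk in enumerate(chunks, start=1):
        collected.extend(chunk)
        # progress is the fraction of chunks delivered so far, in (0, 1]
        if callback(chunk, i / total) == 0:
            break
    return collected


def stop_at_half(samples, progress):
    """Return 1 to continue, 0 to stop once half of the work is done."""
    print(f"Progress: {progress * 100:.1f}%")
    return 1 if progress < 0.5 else 0


fake_chunks = [[0.0, 0.1], [0.2, 0.3], [0.4, 0.5], [0.6, 0.7]]
audio = generate_with_callback(fake_chunks, stop_at_half)
print(len(audio), "samples kept")
```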

C++ API

You can use the following code to play with supertonic-3 with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;

  config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
  config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
  config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
  config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
  config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
  config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
  config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";

  config.model.num_threads = 2;

  // If you don't want to see debug messages, please set it to 0
  config.model.debug = 1;

  std::string filename = "./test.wav";
  std::string text = "Tämä on tekstistä puheeksi -moottori, joka käyttää seuraavan sukupolven kaldia";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.num_steps = 8;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed
  gen_cfg.extra = R"({"lang": "fi"})";

  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

To compile the above example, first build sherpa-onnx as shared libraries:

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
 -DSHERPA_ONNX_ENABLE_C_API=ON \
 -DCMAKE_BUILD_TYPE=Release \
 -DBUILD_SHARED_LIBS=ON \
 -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
 ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-supertonic.cc. Then you can compile it with the following command:

g++ \
  -std=c++17 \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-supertonic \
  /tmp/test-supertonic.cc \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-cxx-api \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-supertonic.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with supertonic-3 with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsSupertonicModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            supertonic: OfflineTtsSupertonicModelConfig {
                duration_predictor: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx".into()),
                text_encoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx".into()),
                vector_estimator: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx".into()),
                vocoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx".into()),
                tts_json: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json".into()),
                unicode_indexer: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin".into()),
                voice_style: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Tämä on tekstistä puheeksi -moottori, joka käyttää seuraavan sukupolven kaldia";

    let gen_config = GenerationConfig {
        sid: 0,
        num_steps: 8,
        speed: 1.0,
        extra: Some(r#"{"lang": "fi"}"#.into()),
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with supertonic-3 with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      supertonic: {
        durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
        textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
        vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
        vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
        ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
        unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
        voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
      },
      debug: true,
      numThreads: 2,
      provider: 'cpu',
    },
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Tämä on tekstistä puheeksi -moottori, joka käyttää seuraavan sukupolven kaldia';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  numSteps: 8,
  speed: 1.0,
  extra: {lang: 'fi'},
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to play with supertonic-3 with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final supertonic = sherpa_onnx.OfflineTtsSupertonicModelConfig(
    durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
    textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
    vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
    vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
    ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
    unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
    voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    supertonic: supertonic,
    numThreads: 2,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    numSteps: 8,
    speed: 1.0,
    extra: {'lang': 'fi'},
  );
  final audio = tts.generateWithConfig(text: 'Tämä on tekstistä puheeksi -moottori, joka käyttää seuraavan sukupolven kaldia', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with supertonic-3 with Swift API.

func run() {
  let supertonic = sherpaOnnxOfflineTtsSupertonicModelConfig(
    durationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
    textEncoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
    vectorEstimator: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
    vocoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
    ttsJson: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
    unicodeIndexer: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
    voiceStyle: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(supertonic: supertonic)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Tämä on tekstistä puheeksi -moottori, joka käyttää seuraavan sukupolven kaldia"
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.numSteps = 8
  genConfig.speed = 1.0
  genConfig.extra = ["lang": "fi"]

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with supertonic-3 with C# API.

using System;
using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Supertonic.DurationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.Model.Supertonic.TextEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.Model.Supertonic.VectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.Model.Supertonic.Vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.Model.Supertonic.TtsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.Model.Supertonic.UnicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.Model.Supertonic.VoiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";

config.Model.NumThreads = 2;
config.Model.Debug = 1;
config.Model.Provider = "cpu";

var tts = new OfflineTts(config);
var text = "Tämä on tekstistä puheeksi -moottori, joka käyttää seuraavan sukupolven kaldia";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.NumSteps = 8;
genConfig.Speed = 1.0f;
genConfig.Extra = "{\"lang\": \"fi\"}";

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with supertonic-3 with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      supertonic = OfflineTtsSupertonicModelConfig(
        durationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
        textEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
        vectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
        vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
        ttsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
        unicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
        voiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
      ),
      numThreads = 2,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    numSteps = 8,
    speed = 1.0f,
    extra = mapOf("lang" to "fi"),
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Tämä on tekstistä puheeksi -moottori, joka käyttää seuraavan sukupolven kaldia",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with supertonic-3 with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var supertonic = new OfflineTtsSupertonicModelConfig();
    supertonic.setDurationPredictor("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx");
    supertonic.setTextEncoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx");
    supertonic.setVectorEstimator("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx");
    supertonic.setVocoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx");
    supertonic.setTtsJson("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json");
    supertonic.setUnicodeIndexer("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin");
    supertonic.setVoiceStyle("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setSupertonic(supertonic);
    modelConfig.setNumThreads(2);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);

    var tts = new OfflineTts(config);
    var text = "Tämä on tekstistä puheeksi -moottori, joka käyttää seuraavan sukupolven kaldia";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setNumSteps(8);
    genConfig.setSpeed(1.0f);
    genConfig.setExtra("{\"lang\": \"fi\"}");

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with supertonic-3 with Pascal API.

program test_supertonic;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Supertonic.DurationPredictor := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx';
  Config.Model.Supertonic.TextEncoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx';
  Config.Model.Supertonic.VectorEstimator := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx';
  Config.Model.Supertonic.Vocoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx';
  Config.Model.Supertonic.TtsJson := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json';
  Config.Model.Supertonic.UnicodeIndexer := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin';
  Config.Model.Supertonic.VoiceStyle := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin';

  Config.Model.NumThreads := 2;
  Config.Model.Debug := True;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.NumSteps := 8;
  GenConfig.Speed := 1.0;
  GenConfig.Extra := '{"lang": "fi"}';

  Audio := Tts.GenerateWithConfig('Tämä on tekstistä puheeksi -moottori, joka käyttää seuraavan sukupolven kaldia', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with supertonic-3 with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Supertonic: sherpa.OfflineTtsSupertonicModelConfig{
				DurationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
				TextEncoder:       "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
				VectorEstimator:   "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
				Vocoder:           "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
				TtsJson:           "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
				UnicodeIndexer:    "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
				VoiceStyle:        "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
			},
			NumThreads: 2,
			Debug:      true,
		},
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Tämä on tekstistä puheeksi -moottori, joka käyttää seuraavan sukupolven kaldia"

	genConfig := sherpa.GenerationConfig{
		Sid:       0,
		NumSteps:  8,
		Speed:     1.0,
		Extra:     `{"lang": "fi"}`,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

Sample audio for each speaker is listed below:

Speaker 0

No. | Text
0 | Hei maailma.
1 | Miten voit tänään?
2 | Taivas on sininen ja tuuli on lempeä.
3 | Koneoppiminen auttaa tietokoneita oppimaan datasta.
4 | Puhesynteesi muuttaa tekstin selkeäksi ääneksi.
5 | Oppilaat lukivat lyhyen tarinan kirjastossa.
6 | Juna myöhästyi raiteiden huollon vuoksi.
7 | Pienet mallit toimivat nopeasti paikallisilla laitteilla.
8 | Ääniavustaja auttaa päivittäisissä tehtävissä.
9 | Vakaa lukeminen on tärkeää sekä lyhyille että pitkille lauseille.

Speaker 1

No. | Text
0 | Hei maailma.
1 | Miten voit tänään?
2 | Taivas on sininen ja tuuli on lempeä.
3 | Koneoppiminen auttaa tietokoneita oppimaan datasta.
4 | Puhesynteesi muuttaa tekstin selkeäksi ääneksi.
5 | Oppilaat lukivat lyhyen tarinan kirjastossa.
6 | Juna myöhästyi raiteiden huollon vuoksi.
7 | Pienet mallit toimivat nopeasti paikallisilla laitteilla.
8 | Ääniavustaja auttaa päivittäisissä tehtävissä.
9 | Vakaa lukeminen on tärkeää sekä lyhyille että pitkille lauseille.

Speaker 2

No. | Text
0 | Hei maailma.
1 | Miten voit tänään?
2 | Taivas on sininen ja tuuli on lempeä.
3 | Koneoppiminen auttaa tietokoneita oppimaan datasta.
4 | Puhesynteesi muuttaa tekstin selkeäksi ääneksi.
5 | Oppilaat lukivat lyhyen tarinan kirjastossa.
6 | Juna myöhästyi raiteiden huollon vuoksi.
7 | Pienet mallit toimivat nopeasti paikallisilla laitteilla.
8 | Ääniavustaja auttaa päivittäisissä tehtävissä.
9 | Vakaa lukeminen on tärkeää sekä lyhyille että pitkille lauseille.

Speaker 3

No. | Text
0 | Hei maailma.
1 | Miten voit tänään?
2 | Taivas on sininen ja tuuli on lempeä.
3 | Koneoppiminen auttaa tietokoneita oppimaan datasta.
4 | Puhesynteesi muuttaa tekstin selkeäksi ääneksi.
5 | Oppilaat lukivat lyhyen tarinan kirjastossa.
6 | Juna myöhästyi raiteiden huollon vuoksi.
7 | Pienet mallit toimivat nopeasti paikallisilla laitteilla.
8 | Ääniavustaja auttaa päivittäisissä tehtävissä.
9 | Vakaa lukeminen on tärkeää sekä lyhyille että pitkille lauseille.

Speaker 4

No. | Text
0 | Hei maailma.
1 | Miten voit tänään?
2 | Taivas on sininen ja tuuli on lempeä.
3 | Koneoppiminen auttaa tietokoneita oppimaan datasta.
4 | Puhesynteesi muuttaa tekstin selkeäksi ääneksi.
5 | Oppilaat lukivat lyhyen tarinan kirjastossa.
6 | Juna myöhästyi raiteiden huollon vuoksi.
7 | Pienet mallit toimivat nopeasti paikallisilla laitteilla.
8 | Ääniavustaja auttaa päivittäisissä tehtävissä.
9 | Vakaa lukeminen on tärkeää sekä lyhyille että pitkille lauseille.

Speaker 5

No. | Text
0 | Hei maailma.
1 | Miten voit tänään?
2 | Taivas on sininen ja tuuli on lempeä.
3 | Koneoppiminen auttaa tietokoneita oppimaan datasta.
4 | Puhesynteesi muuttaa tekstin selkeäksi ääneksi.
5 | Oppilaat lukivat lyhyen tarinan kirjastossa.
6 | Juna myöhästyi raiteiden huollon vuoksi.
7 | Pienet mallit toimivat nopeasti paikallisilla laitteilla.
8 | Ääniavustaja auttaa päivittäisissä tehtävissä.
9 | Vakaa lukeminen on tärkeää sekä lyhyille että pitkille lauseille.

Speaker 6

No. | Text
0 | Hei maailma.
1 | Miten voit tänään?
2 | Taivas on sininen ja tuuli on lempeä.
3 | Koneoppiminen auttaa tietokoneita oppimaan datasta.
4 | Puhesynteesi muuttaa tekstin selkeäksi ääneksi.
5 | Oppilaat lukivat lyhyen tarinan kirjastossa.
6 | Juna myöhästyi raiteiden huollon vuoksi.
7 | Pienet mallit toimivat nopeasti paikallisilla laitteilla.
8 | Ääniavustaja auttaa päivittäisissä tehtävissä.
9 | Vakaa lukeminen on tärkeää sekä lyhyille että pitkille lauseille.

Speaker 7

No. | Text
0 | Hei maailma.
1 | Miten voit tänään?
2 | Taivas on sininen ja tuuli on lempeä.
3 | Koneoppiminen auttaa tietokoneita oppimaan datasta.
4 | Puhesynteesi muuttaa tekstin selkeäksi ääneksi.
5 | Oppilaat lukivat lyhyen tarinan kirjastossa.
6 | Juna myöhästyi raiteiden huollon vuoksi.
7 | Pienet mallit toimivat nopeasti paikallisilla laitteilla.
8 | Ääniavustaja auttaa päivittäisissä tehtävissä.
9 | Vakaa lukeminen on tärkeää sekä lyhyille että pitkille lauseille.

Speaker 8

No. | Text
0 | Hei maailma.
1 | Miten voit tänään?
2 | Taivas on sininen ja tuuli on lempeä.
3 | Koneoppiminen auttaa tietokoneita oppimaan datasta.
4 | Puhesynteesi muuttaa tekstin selkeäksi ääneksi.
5 | Oppilaat lukivat lyhyen tarinan kirjastossa.
6 | Juna myöhästyi raiteiden huollon vuoksi.
7 | Pienet mallit toimivat nopeasti paikallisilla laitteilla.
8 | Ääniavustaja auttaa päivittäisissä tehtävissä.
9 | Vakaa lukeminen on tärkeää sekä lyhyille että pitkille lauseille.

Speaker 9

No. | Text
0 | Hei maailma.
1 | Miten voit tänään?
2 | Taivas on sininen ja tuuli on lempeä.
3 | Koneoppiminen auttaa tietokoneita oppimaan datasta.
4 | Puhesynteesi muuttaa tekstin selkeäksi ääneksi.
5 | Oppilaat lukivat lyhyen tarinan kirjastossa.
6 | Juna myöhästyi raiteiden huollon vuoksi.
7 | Pienet mallit toimivat nopeasti paikallisilla laitteilla.
8 | Ääniavustaja auttaa päivittäisissä tehtävissä.
9 | Vakaa lukeminen on tärkeää sekä lyhyille että pitkille lauseille.
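
The listing above is a grid of 10 speakers by 10 sentences. A batch job that produces such a grid could be planned as sketched below. This is only an illustration: the sample_jobs helper and the output file naming are hypothetical, and only three of the ten sentences are included for brevity.

```python
from itertools import product

# Three of the sample sentences listed above (shortened list for brevity)
SENTENCES = [
    "Hei maailma.",
    "Miten voit tänään?",
    "Taivas on sininen ja tuuli on lempeä.",
]


def sample_jobs(num_speakers, sentences):
    """Return one (sid, sentence_index, text, filename) tuple per grid cell."""
    return [
        # The file naming scheme is hypothetical
        (sid, idx, sentences[idx], f"speaker-{sid}-sentence-{idx}.wav")
        for sid, idx in product(range(num_speakers), range(len(sentences)))
    ]
```

Each (sid, text) pair would then be passed to the TTS engine, with the speaker ID set through the generation config as in the API examples above.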

French

This section lists text-to-speech models for French.

vits-piper-fr_FR-gilles-low

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/fr/fr_FR/gilles/low

Number of speakers | Sample rate
1 | 16000
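
To confirm that a generated file matches the 16000 Hz sample rate listed above, one option is to read the WAV header with Python's standard wave module. This is a minimal sketch: wav_info is a hypothetical helper name, and it assumes the file is uncompressed PCM WAV, such as the test.wav written by the examples in this section.

```python
import wave


def wav_info(path):
    """Return (sample_rate, num_channels, duration_in_seconds) of a PCM WAV file."""
    with wave.open(path, "rb") as f:
        sample_rate = f.getframerate()
        return sample_rate, f.getnchannels(), f.getnframes() / sample_rate
```

For this model, the first element returned by wav_info("test.wav") should be 16000.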

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-fr_FR-gilles-low.tar.bz2
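
If wget and tar are not available, the archive can also be fetched and unpacked from Python. The following is a minimal sketch using only the standard library; the helper names extract_model and download_model are hypothetical, and the URL is the one given above.

```python
import tarfile
import urllib.request
from pathlib import Path

MODEL_URL = (
    "https://github.com/k2-fsa/sherpa-onnx/releases/download/"
    "tts-models/vits-piper-fr_FR-gilles-low.tar.bz2"
)


def extract_model(archive, dest="."):
    """Extract a .tar.bz2 archive into dest and return its top-level entries."""
    with tarfile.open(archive, "r:bz2") as tar:
        top_level = sorted({name.split("/")[0] for name in tar.getnames()})
        tar.extractall(dest)
    return top_level


def download_model(url=MODEL_URL, dest="."):
    """Download the archive into dest (skipped if already present), then extract it."""
    archive = Path(dest) / url.rsplit("/", 1)[-1]
    if not archive.exists():
        urllib.request.urlretrieve(url, archive)
    return extract_model(archive, dest)
```

Calling download_model() should leave a vits-piper-fr_FR-gilles-low directory in the current directory, which is the layout the API examples in this section assume.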

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI | URL | China mirror
arm64-v8a | Download | Download
armeabi-v7a | Download | Download
x86_64 | Download | Download
x86 | Download | Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-fr_FR-gilles-low with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-fr_FR-gilles-low/fr_FR-gilles-low.onnx";
  config.model.vits.tokens = "vits-piper-fr_FR-gilles-low/tokens.txt";
  config.model.vits.data_dir = "vits-piper-fr_FR-gilles-low/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

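  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
  if (!tts) {
    // Creation fails if, e.g., a model path in the config is wrong
    fprintf(stderr, "Failed to create the TTS engine. Please check your config\n");
    return -1;
  }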

  const char *text = "Pas de nouvelles, bonnes nouvelles.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc   -I /tmp/sherpa-onnx/shared/include   -o /tmp/test-piper   /tmp/test-piper.c   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-fr_FR-gilles-low.tar.bz2

You can use the following code to play with vits-piper-fr_FR-gilles-low

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-fr_FR-gilles-low/fr_FR-gilles-low.onnx",
            data_dir="vits-piper-fr_FR-gilles-low/espeak-ng-data",
            tokens="vits-piper-fr_FR-gilles-low/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Pas de nouvelles, bonnes nouvelles.",
                     sid=0,
                     speed=1.0)

sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
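
To gauge synthesis speed, the real-time factor (RTF) is the processing time divided by the duration of the generated audio; values below 1 mean synthesis is faster than real time. The sketch below wraps the tts.generate call shown above. The helper names real_time_factor and timed_generate are hypothetical; audio.samples and audio.sample_rate are used as in the example above.

```python
import time


def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """RTF = processing time / duration of the generated audio."""
    return elapsed_seconds / (num_samples / sample_rate)


def timed_generate(tts, text, sid=0, speed=1.0):
    """Run tts.generate() and return (audio, rtf)."""
    start = time.perf_counter()
    audio = tts.generate(text=text, sid=sid, speed=speed)
    elapsed = time.perf_counter() - start
    return audio, real_time_factor(elapsed, len(audio.samples), audio.sample_rate)
```

For example, 0.5 seconds of processing for 2 seconds of audio gives an RTF of 0.25.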

C++ API

You can use the following code to play with vits-piper-fr_FR-gilles-low with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-fr_FR-gilles-low/fr_FR-gilles-low.onnx";
  config.model.vits.tokens = "vits-piper-fr_FR-gilles-low/tokens.txt";
  config.model.vits.data_dir = "vits-piper-fr_FR-gilles-low/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Pas de nouvelles, bonnes nouvelles.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++   -std=c++17   -I /tmp/sherpa-onnx/shared/include   -o /tmp/test-piper   /tmp/test-piper.cc   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-cxx-api   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html
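
The ProgressCallback in the C++ example above returns 1 to keep generating and 0 to stop. The same contract can be sketched in Python as a callable object that collects audio chunks and asks the generator to stop once a length budget is reached. This assumes that the Python generate method accepts a callback argument that is called with (samples, progress) and returns an integer; please verify this against the Python API of your installed version. ChunkCollector and max_seconds are hypothetical names.

```python
class ChunkCollector:
    """Collect streamed audio chunks; ask the generator to stop after max_seconds."""

    def __init__(self, sample_rate, max_seconds):
        self.sample_rate = sample_rate
        self.max_samples = int(sample_rate * max_seconds)
        self.chunks = []
        self.num_samples = 0

    def __call__(self, samples, progress):
        self.chunks.append(samples)
        self.num_samples += len(samples)
        # 1 -> continue generating, 0 -> stop generating
        return 1 if self.num_samples < self.max_samples else 0


# Assumed usage (requires sherpa_onnx and the model files):
# collector = ChunkCollector(sample_rate=tts.sample_rate, max_seconds=5.0)
# audio = tts.generate(text, sid=0, speed=1.0, callback=collector)
```

Keeping the state in an object rather than in globals makes it easy to inspect the collected chunks after generation ends.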

Rust API

You can use the following code to play with vits-piper-fr_FR-gilles-low with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-fr_FR-gilles-low/fr_FR-gilles-low.onnx".into()),
                tokens: Some("vits-piper-fr_FR-gilles-low/tokens.txt".into()),
                data_dir: Some("vits-piper-fr_FR-gilles-low/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Pas de nouvelles, bonnes nouvelles.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-fr_FR-gilles-low with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-fr_FR-gilles-low/fr_FR-gilles-low.onnx',
        tokens: 'vits-piper-fr_FR-gilles-low/tokens.txt',
        dataDir: 'vits-piper-fr_FR-gilles-low/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Pas de nouvelles, bonnes nouvelles.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to play with vits-piper-fr_FR-gilles-low with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-fr_FR-gilles-low/fr_FR-gilles-low.onnx',
    tokens: 'vits-piper-fr_FR-gilles-low/tokens.txt',
    dataDir: 'vits-piper-fr_FR-gilles-low/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Pas de nouvelles, bonnes nouvelles.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-fr_FR-gilles-low with Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-fr_FR-gilles-low/fr_FR-gilles-low.onnx",
    lexicon: "",
    tokens: "vits-piper-fr_FR-gilles-low/tokens.txt",
    dataDir: "vits-piper-fr_FR-gilles-low/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Pas de nouvelles, bonnes nouvelles."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-fr_FR-gilles-low with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-fr_FR-gilles-low/fr_FR-gilles-low.onnx";
config.Model.Vits.Tokens = "vits-piper-fr_FR-gilles-low/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-fr_FR-gilles-low/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Pas de nouvelles, bonnes nouvelles.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-fr_FR-gilles-low with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-fr_FR-gilles-low/fr_FR-gilles-low.onnx",
        tokens = "vits-piper-fr_FR-gilles-low/tokens.txt",
        dataDir = "vits-piper-fr_FR-gilles-low/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Pas de nouvelles, bonnes nouvelles.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with vits-piper-fr_FR-gilles-low with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-fr_FR-gilles-low/fr_FR-gilles-low.onnx");
    vits.setTokens("vits-piper-fr_FR-gilles-low/tokens.txt");
    vits.setDataDir("vits-piper-fr_FR-gilles-low/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Pas de nouvelles, bonnes nouvelles.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-fr_FR-gilles-low with Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-fr_FR-gilles-low/fr_FR-gilles-low.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-fr_FR-gilles-low/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-fr_FR-gilles-low/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Pas de nouvelles, bonnes nouvelles.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-fr_FR-gilles-low with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-fr_FR-gilles-low/fr_FR-gilles-low.onnx",
				Tokens:  "vits-piper-fr_FR-gilles-low/tokens.txt",
				DataDir: "vits-piper-fr_FR-gilles-low/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Pas de nouvelles, bonnes nouvelles."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Pas de nouvelles, bonnes nouvelles.

sample audio for each speaker is listed below:

Speaker 0

vits-piper-fr_FR-miro-high

Info about this model

This model is converted from https://huggingface.co/OpenVoiceOS/pipertts_fr-FR_miro

Number of speakers | Sample rate
1 | 22050

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-fr_FR-miro-high.tar.bz2

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI | URL | China mirror
arm64-v8a | Download | Download
armeabi-v7a | Download | Download
x86_64 | Download | Download
x86 | Download | Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-fr_FR-miro-high with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-fr_FR-miro-high/fr_FR-miro-high.onnx";
  config.model.vits.tokens = "vits-piper-fr_FR-miro-high/tokens.txt";
  config.model.vits.data_dir = "vits-piper-fr_FR-miro-high/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

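  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
  if (!tts) {
    // Creation fails if, e.g., a model path in the config is wrong
    fprintf(stderr, "Failed to create the TTS engine. Please check your config\n");
    return -1;
  }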

  const char *text = "Pas de nouvelles, bonnes nouvelles.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc   -I /tmp/sherpa-onnx/shared/include   -o /tmp/test-piper   /tmp/test-piper.c   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-fr_FR-miro-high.tar.bz2

You can use the following code to play with vits-piper-fr_FR-miro-high

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-fr_FR-miro-high/fr_FR-miro-high.onnx",
            data_dir="vits-piper-fr_FR-miro-high/espeak-ng-data",
            tokens="vits-piper-fr_FR-miro-high/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Pas de nouvelles, bonnes nouvelles.",
                     sid=0,
                     speed=1.0)

sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)

C++ API

You can use the following code to play with vits-piper-fr_FR-miro-high with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-fr_FR-miro-high/fr_FR-miro-high.onnx";
  config.model.vits.tokens = "vits-piper-fr_FR-miro-high/tokens.txt";
  config.model.vits.data_dir = "vits-piper-fr_FR-miro-high/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Pas de nouvelles, bonnes nouvelles.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++   -std=c++17   -I /tmp/sherpa-onnx/shared/include   -o /tmp/test-piper   /tmp/test-piper.cc   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-cxx-api   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with vits-piper-fr_FR-miro-high with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-fr_FR-miro-high/fr_FR-miro-high.onnx".into()),
                tokens: Some("vits-piper-fr_FR-miro-high/tokens.txt".into()),
                data_dir: Some("vits-piper-fr_FR-miro-high/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Pas de nouvelles, bonnes nouvelles.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-fr_FR-miro-high with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-fr_FR-miro-high/fr_FR-miro-high.onnx',
        tokens: 'vits-piper-fr_FR-miro-high/tokens.txt',
        dataDir: 'vits-piper-fr_FR-miro-high/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Pas de nouvelles, bonnes nouvelles.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.
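
The Node.js example above prints a real-time factor (RTF): the time spent synthesizing divided by the duration of the generated audio. An RTF below 1 means synthesis runs faster than playback. The arithmetic can be sketched in Python as follows; the function name and the numbers are illustrative only and are not part of sherpa-onnx.

```python
def real_time_factor(elapsed_seconds: float, num_samples: int, sample_rate: int) -> float:
    """Return synthesis time divided by the duration of the generated audio."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # audio length in seconds
    return elapsed_seconds / duration


# Hypothetical measurement: 0.5 s spent producing 44100 samples at 22050 Hz
rtf = real_time_factor(0.5, 44100, 22050)
print(f"RTF = {rtf:.3f}")  # prints "RTF = 0.250"
```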

Dart API

You can use the following code to try vits-piper-fr_FR-miro-high with the Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-fr_FR-miro-high/fr_FR-miro-high.onnx',
    tokens: 'vits-piper-fr_FR-miro-high/tokens.txt',
    dataDir: 'vits-piper-fr_FR-miro-high/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Pas de nouvelles, bonnes nouvelles.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to try vits-piper-fr_FR-miro-high with the Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-fr_FR-miro-high/fr_FR-miro-high.onnx",
    lexicon: "",
    tokens: "vits-piper-fr_FR-miro-high/tokens.txt",
    dataDir: "vits-piper-fr_FR-miro-high/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Pas de nouvelles, bonnes nouvelles."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to try vits-piper-fr_FR-miro-high with the C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-fr_FR-miro-high/fr_FR-miro-high.onnx";
config.Model.Vits.Tokens = "vits-piper-fr_FR-miro-high/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-fr_FR-miro-high/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Pas de nouvelles, bonnes nouvelles.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to try vits-piper-fr_FR-miro-high with the Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-fr_FR-miro-high/fr_FR-miro-high.onnx",
        tokens = "vits-piper-fr_FR-miro-high/tokens.txt",
        dataDir = "vits-piper-fr_FR-miro-high/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Pas de nouvelles, bonnes nouvelles.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.
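
The callback convention used in the examples above (return 1 to continue, 0 to stop) can be illustrated without sherpa-onnx. The sketch below is a simulation: `fake_generate` is a hypothetical stand-in that emits silent chunks, not a sherpa-onnx function, and the callback signature is an assumption, since the real bindings differ in the arguments they pass. Whether the real engine keeps the chunk delivered just before a stop request is also an assumption here.

```python
from typing import Callable, List

# Callback type: receives the newest chunk and the progress in [0, 1];
# returns 1 to continue generating or 0 to stop.
Callback = Callable[[List[float], float], int]


def fake_generate(num_chunks: int, chunk_size: int, callback: Callback) -> List[float]:
    """Hypothetical stand-in for a TTS engine that produces audio chunk by chunk."""
    samples: List[float] = []
    for i in range(num_chunks):
        chunk = [0.0] * chunk_size  # silence instead of real speech
        samples.extend(chunk)
        progress = (i + 1) / num_chunks
        if callback(chunk, progress) == 0:
            break  # the caller asked to stop early
    return samples


def stop_at_half(chunk: List[float], progress: float) -> int:
    print(f"Progress: {progress * 100:.1f}%")
    return 0 if progress >= 0.5 else 1


audio = fake_generate(num_chunks=4, chunk_size=100, callback=stop_at_half)
print(len(audio))  # prints 200: generation stopped after the second chunk
```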

Java API

You can use the following code to try vits-piper-fr_FR-miro-high with the Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-fr_FR-miro-high/fr_FR-miro-high.onnx");
    vits.setTokens("vits-piper-fr_FR-miro-high/tokens.txt");
    vits.setDataDir("vits-piper-fr_FR-miro-high/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Pas de nouvelles, bonnes nouvelles.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to try vits-piper-fr_FR-miro-high with the Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-fr_FR-miro-high/fr_FR-miro-high.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-fr_FR-miro-high/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-fr_FR-miro-high/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Pas de nouvelles, bonnes nouvelles.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to try vits-piper-fr_FR-miro-high with the Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-fr_FR-miro-high/fr_FR-miro-high.onnx",
				Tokens:  "vits-piper-fr_FR-miro-high/tokens.txt",
				DataDir: "vits-piper-fr_FR-miro-high/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Pas de nouvelles, bonnes nouvelles."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Pas de nouvelles, bonnes nouvelles.

sample audio for each speaker is listed below:

Speaker 0

vits-piper-fr_FR-siwis-low

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/fr/fr_FR/siwis/low

Number of speakers    Sample rate
1                     16000

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-fr_FR-siwis-low.tar.bz2
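
If you prefer to fetch the archive from a script, the following is a minimal sketch using only the Python standard library. The helper names are hypothetical, and it assumes the archive unpacks into a directory named after the archive file, which matches the paths used in the examples in this section.

```python
import tarfile
import urllib.request
from pathlib import Path

MODEL_URL = (
    "https://github.com/k2-fsa/sherpa-onnx/releases/download/"
    "tts-models/vits-piper-fr_FR-siwis-low.tar.bz2"
)


def model_dir_name(url: str) -> str:
    """Derive the model directory name from the archive URL (drops '.tar.bz2')."""
    name = url.rsplit("/", 1)[-1]
    suffix = ".tar.bz2"
    if not name.endswith(suffix):
        raise ValueError(f"expected a {suffix} archive, got {name}")
    return name[: -len(suffix)]


def extract_archive(archive: Path, dest: Path) -> None:
    """Extract a bzip2-compressed tarball into dest."""
    with tarfile.open(archive, "r:bz2") as tar:
        tar.extractall(dest)


def download_and_extract(url: str, dest: Path) -> Path:
    """Download the archive, extract it, and return the assumed model directory."""
    dest.mkdir(parents=True, exist_ok=True)
    archive = dest / url.rsplit("/", 1)[-1]
    urllib.request.urlretrieve(url, archive)
    extract_archive(archive, dest)
    archive.unlink()  # remove the tarball once it has been extracted
    return dest / model_dir_name(url)
```

Calling `download_and_extract(MODEL_URL, Path("."))` would place the model files under `./vits-piper-fr_FR-siwis-low`, provided the assumption about the archive layout holds.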

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI            URL         China mirror
arm64-v8a      Download    Download
armeabi-v7a    Download    Download
x86_64         Download    Download
x86            Download    Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to try vits-piper-fr_FR-siwis-low with the C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-fr_FR-siwis-low/fr_FR-siwis-low.onnx";
  config.model.vits.tokens = "vits-piper-fr_FR-siwis-low/tokens.txt";
  config.model.vits.data_dir = "vits-piper-fr_FR-siwis-low/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Pas de nouvelles, bonnes nouvelles.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and libraries inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc \
  -I /tmp/sherpa-onnx/shared/include \
  -L /tmp/sherpa-onnx/shared/lib \
  -o /tmp/test-piper \
  /tmp/test-piper.c \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx and soundfile via

pip install sherpa-onnx soundfile

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-fr_FR-siwis-low.tar.bz2

You can use the following code to try vits-piper-fr_FR-siwis-low with the Python API.

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-fr_FR-siwis-low/fr_FR-siwis-low.onnx",
            data_dir="vits-piper-fr_FR-siwis-low/espeak-ng-data",
            tokens="vits-piper-fr_FR-siwis-low/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Pas de nouvelles, bonnes nouvelles.",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
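
To confirm that a generated WAV file has the sample rate listed for the model, you can inspect its header with Python's standard `wave` module. This is a minimal sketch: `wav_info` is a hypothetical helper, and it assumes the file holds integer PCM data, which is the only kind the `wave` module reads.

```python
import wave
from typing import Tuple


def wav_info(path: str) -> Tuple[int, int, float]:
    """Return (sample_rate, num_channels, duration_in_seconds) of a PCM WAV file."""
    with wave.open(path, "rb") as f:
        sample_rate = f.getframerate()
        channels = f.getnchannels()
        duration = f.getnframes() / sample_rate
    return sample_rate, channels, duration
```

For example, `wav_info("test.wav")` on a file generated with this model should report a sample rate of 16000.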

C++ API

You can use the following code to try vits-piper-fr_FR-siwis-low with the C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-fr_FR-siwis-low/fr_FR-siwis-low.onnx";
  config.model.vits.tokens = "vits-piper-fr_FR-siwis-low/tokens.txt";
  config.model.vits.data_dir = "vits-piper-fr_FR-siwis-low/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Pas de nouvelles, bonnes nouvelles.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and libraries inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++ -std=c++17 \
  -I /tmp/sherpa-onnx/shared/include \
  -L /tmp/sherpa-onnx/shared/lib \
  -o /tmp/test-piper \
  /tmp/test-piper.cc \
  -lsherpa-onnx-cxx-api \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to try vits-piper-fr_FR-siwis-low with the Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-fr_FR-siwis-low/fr_FR-siwis-low.onnx".into()),
                tokens: Some("vits-piper-fr_FR-siwis-low/tokens.txt".into()),
                data_dir: Some("vits-piper-fr_FR-siwis-low/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Pas de nouvelles, bonnes nouvelles.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-fr_FR-siwis-low with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-fr_FR-siwis-low/fr_FR-siwis-low.onnx',
        tokens: 'vits-piper-fr_FR-siwis-low/tokens.txt',
        dataDir: 'vits-piper-fr_FR-siwis-low/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Pas de nouvelles, bonnes nouvelles.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to try vits-piper-fr_FR-siwis-low with the Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-fr_FR-siwis-low/fr_FR-siwis-low.onnx',
    tokens: 'vits-piper-fr_FR-siwis-low/tokens.txt',
    dataDir: 'vits-piper-fr_FR-siwis-low/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Pas de nouvelles, bonnes nouvelles.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to try vits-piper-fr_FR-siwis-low with the Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-fr_FR-siwis-low/fr_FR-siwis-low.onnx",
    lexicon: "",
    tokens: "vits-piper-fr_FR-siwis-low/tokens.txt",
    dataDir: "vits-piper-fr_FR-siwis-low/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Pas de nouvelles, bonnes nouvelles."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to try vits-piper-fr_FR-siwis-low with the C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-fr_FR-siwis-low/fr_FR-siwis-low.onnx";
config.Model.Vits.Tokens = "vits-piper-fr_FR-siwis-low/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-fr_FR-siwis-low/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Pas de nouvelles, bonnes nouvelles.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to try vits-piper-fr_FR-siwis-low with the Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-fr_FR-siwis-low/fr_FR-siwis-low.onnx",
        tokens = "vits-piper-fr_FR-siwis-low/tokens.txt",
        dataDir = "vits-piper-fr_FR-siwis-low/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Pas de nouvelles, bonnes nouvelles.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to try vits-piper-fr_FR-siwis-low with the Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-fr_FR-siwis-low/fr_FR-siwis-low.onnx");
    vits.setTokens("vits-piper-fr_FR-siwis-low/tokens.txt");
    vits.setDataDir("vits-piper-fr_FR-siwis-low/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Pas de nouvelles, bonnes nouvelles.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to try vits-piper-fr_FR-siwis-low with the Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-fr_FR-siwis-low/fr_FR-siwis-low.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-fr_FR-siwis-low/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-fr_FR-siwis-low/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Pas de nouvelles, bonnes nouvelles.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to try vits-piper-fr_FR-siwis-low with the Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-fr_FR-siwis-low/fr_FR-siwis-low.onnx",
				Tokens:  "vits-piper-fr_FR-siwis-low/tokens.txt",
				DataDir: "vits-piper-fr_FR-siwis-low/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Pas de nouvelles, bonnes nouvelles."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Pas de nouvelles, bonnes nouvelles.

sample audio for each speaker is listed below:

Speaker 0

vits-piper-fr_FR-siwis-medium

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/fr/fr_FR/siwis/medium

Number of speakers    Sample rate
1                     22050

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-fr_FR-siwis-medium.tar.bz2

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI            URL         China mirror
arm64-v8a      Download    Download
armeabi-v7a    Download    Download
x86_64         Download    Download
x86            Download    Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to try vits-piper-fr_FR-siwis-medium with the C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-fr_FR-siwis-medium/fr_FR-siwis-medium.onnx";
  config.model.vits.tokens = "vits-piper-fr_FR-siwis-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-fr_FR-siwis-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Pas de nouvelles, bonnes nouvelles.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-piper \
  /tmp/test-piper.c \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

If the program fails to start because it cannot find the shared libraries, you probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html
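
The library-path step above can also be scripted. The following is a minimal Python sketch, not part of sherpa-onnx: the helper name env_with_library_path is hypothetical. It prepends a library directory to LD_LIBRARY_PATH on Linux or DYLD_LIBRARY_PATH on macOS and returns the resulting environment.

```python
import os
import sys


def env_with_library_path(lib_dir, platform=sys.platform, base_env=None):
    """Return a copy of the environment with lib_dir prepended to the
    dynamic-loader search path (hypothetical helper, not a sherpa-onnx API)."""
    env = dict(os.environ if base_env is None else base_env)
    # macOS reads DYLD_LIBRARY_PATH; Linux reads LD_LIBRARY_PATH.
    var = "DYLD_LIBRARY_PATH" if platform == "darwin" else "LD_LIBRARY_PATH"
    old = env.get(var, "")
    env[var] = lib_dir if not old else lib_dir + os.pathsep + old
    return env
```

Assuming the install prefix used in this section, the result of env_with_library_path("/tmp/sherpa-onnx/shared/lib") can be passed as the env argument of subprocess.run when launching /tmp/test-piper.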

Python API

Assume you have installed sherpa-onnx and soundfile via

pip install sherpa-onnx soundfile

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-fr_FR-siwis-medium.tar.bz2

You can use the following code to try vits-piper-fr_FR-siwis-medium with the Python API.

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-fr_FR-siwis-medium/fr_FR-siwis-medium.onnx",
            data_dir="vits-piper-fr_FR-siwis-medium/espeak-ng-data",
            tokens="vits-piper-fr_FR-siwis-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Pas de nouvelles, bonnes nouvelles.",
                     sid=0,
                     speed=1.0)

sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
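
If soundfile is not available, the samples can also be saved with Python's standard library. The following is a minimal sketch under the assumption that audio.samples holds floats in the range [-1, 1]; the helper name write_wav_int16 is hypothetical and not part of sherpa-onnx. It clips each sample and stores it as 16-bit mono PCM.

```python
import struct
import wave


def write_wav_int16(path, samples, sample_rate):
    """Write float samples (assumed to be in [-1, 1]) as 16-bit mono PCM."""
    # Clip to [-1, 1] first so out-of-range values cannot overflow int16.
    ints = [int(max(-1.0, min(1.0, float(s))) * 32767) for s in samples]
    with wave.open(path, "wb") as f:
        f.setnchannels(1)            # mono
        f.setsampwidth(2)            # 2 bytes per sample = 16-bit
        f.setframerate(sample_rate)
        f.writeframes(struct.pack("<%dh" % len(ints), *ints))
```

With the example above, it would be called as write_wav_int16("test.wav", audio.samples, audio.sample_rate).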

C++ API

You can use the following code to try vits-piper-fr_FR-siwis-medium with the C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-fr_FR-siwis-medium/fr_FR-siwis-medium.onnx";
  config.model.vits.tokens = "vits-piper-fr_FR-siwis-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-fr_FR-siwis-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Pas de nouvelles, bonnes nouvelles.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++ \
  -std=c++17 \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-piper \
  /tmp/test-piper.cc \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-cxx-api \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

If the program fails to start because it cannot find the shared libraries, you probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to try vits-piper-fr_FR-siwis-medium with the Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-fr_FR-siwis-medium/fr_FR-siwis-medium.onnx".into()),
                tokens: Some("vits-piper-fr_FR-siwis-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-fr_FR-siwis-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Pas de nouvelles, bonnes nouvelles.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to try vits-piper-fr_FR-siwis-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-fr_FR-siwis-medium/fr_FR-siwis-medium.onnx',
        tokens: 'vits-piper-fr_FR-siwis-medium/tokens.txt',
        dataDir: 'vits-piper-fr_FR-siwis-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Pas de nouvelles, bonnes nouvelles.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.
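
The real-time factor (RTF) printed by the example above is the elapsed processing time divided by the duration of the generated audio, so a value below 1 means synthesis ran faster than playback. A minimal Python sketch of the same arithmetic follows; the function name is hypothetical.

```python
def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """RTF = processing time / audio duration (duration = samples / sample rate)."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate
    return elapsed_seconds / duration
```

For instance, producing 32000 samples at 16000 Hz (2 seconds of audio) in 0.5 seconds gives an RTF of 0.25.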

Dart API

You can use the following code to try vits-piper-fr_FR-siwis-medium with the Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-fr_FR-siwis-medium/fr_FR-siwis-medium.onnx',
    tokens: 'vits-piper-fr_FR-siwis-medium/tokens.txt',
    dataDir: 'vits-piper-fr_FR-siwis-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Pas de nouvelles, bonnes nouvelles.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to try vits-piper-fr_FR-siwis-medium with the Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-fr_FR-siwis-medium/fr_FR-siwis-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-fr_FR-siwis-medium/tokens.txt",
    dataDir: "vits-piper-fr_FR-siwis-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Pas de nouvelles, bonnes nouvelles."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to try vits-piper-fr_FR-siwis-medium with the C# API.

using System;
using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-fr_FR-siwis-medium/fr_FR-siwis-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-fr_FR-siwis-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-fr_FR-siwis-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Pas de nouvelles, bonnes nouvelles.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to try vits-piper-fr_FR-siwis-medium with the Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-fr_FR-siwis-medium/fr_FR-siwis-medium.onnx",
        tokens = "vits-piper-fr_FR-siwis-medium/tokens.txt",
        dataDir = "vits-piper-fr_FR-siwis-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Pas de nouvelles, bonnes nouvelles.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.
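
The callback in the example above returns 1 to continue and 0 to stop. The following toy Python sketch illustrates that contract only. generate_in_chunks is a hypothetical stand-in driver, not sherpa-onnx code, and it assumes that audio reaches the callback chunk by chunk.

```python
def generate_in_chunks(chunks, callback):
    """Toy driver: hand each chunk to callback and stop early when it returns 0."""
    produced = []
    for chunk in chunks:
        produced.extend(chunk)
        if callback(chunk) == 0:  # 0 means stop, 1 means continue
            break
    return produced


def stop_after(max_samples):
    """Build a callback that asks the driver to stop once max_samples were seen."""
    seen = 0

    def callback(samples):
        nonlocal seen
        seen += len(samples)
        return 1 if seen < max_samples else 0

    return callback
```

A callback built with stop_after is one way to cut generation short, for example when the user cancels playback.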

Java API

You can use the following code to try vits-piper-fr_FR-siwis-medium with the Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-fr_FR-siwis-medium/fr_FR-siwis-medium.onnx");
    vits.setTokens("vits-piper-fr_FR-siwis-medium/tokens.txt");
    vits.setDataDir("vits-piper-fr_FR-siwis-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Pas de nouvelles, bonnes nouvelles.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to try vits-piper-fr_FR-siwis-medium with the Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-fr_FR-siwis-medium/fr_FR-siwis-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-fr_FR-siwis-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-fr_FR-siwis-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Pas de nouvelles, bonnes nouvelles.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to try vits-piper-fr_FR-siwis-medium with the Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-fr_FR-siwis-medium/fr_FR-siwis-medium.onnx",
				Tokens:  "vits-piper-fr_FR-siwis-medium/tokens.txt",
				DataDir: "vits-piper-fr_FR-siwis-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Pas de nouvelles, bonnes nouvelles."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Pas de nouvelles, bonnes nouvelles.

sample audios for different speakers are listed below:

Speaker 0

vits-piper-fr_FR-tjiho-model1

Info about this model

This model is converted from https://huggingface.co/csukuangfj/vits-piper-fr_FR-tjiho-model1/tree/main

Number of speakers    Sample rate
1                     44100

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-fr_FR-tjiho-model1.tar.bz2
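
The archive can be fetched with wget and unpacked with tar xvf, as shown for other models in this document. As an alternative, here is a minimal Python sketch that uses only the standard library. The function names are hypothetical, and it assumes the archive unpacks into a vits-piper-fr_FR-tjiho-model1 directory, as the paths in the examples for this model suggest.

```python
import tarfile
import urllib.request

MODEL_URL = (
    "https://github.com/k2-fsa/sherpa-onnx/releases/download/"
    "tts-models/vits-piper-fr_FR-tjiho-model1.tar.bz2"
)


def download(url, archive_path):
    """Download url to archive_path and return the local path."""
    urllib.request.urlretrieve(url, archive_path)
    return archive_path


def extract(archive_path, dest_dir="."):
    """Unpack a .tar.bz2 archive into dest_dir and return the member names."""
    with tarfile.open(archive_path, "r:bz2") as tar:
        names = tar.getnames()
        tar.extractall(dest_dir)  # only use with archives from a trusted source
    return names
```

Calling extract(download(MODEL_URL, "model.tar.bz2")) would then fetch and unpack the model into the current directory.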

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI            URL         China mirror
arm64-v8a      Download    Download
armeabi-v7a    Download    Download
x86_64         Download    Download
x86            Download    Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to try vits-piper-fr_FR-tjiho-model1 with the C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-fr_FR-tjiho-model1/fr_FR-tjiho-model1.onnx";
  config.model.vits.tokens = "vits-piper-fr_FR-tjiho-model1/tokens.txt";
  config.model.vits.data_dir = "vits-piper-fr_FR-tjiho-model1/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Pas de nouvelles, bonnes nouvelles.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-piper \
  /tmp/test-piper.c \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

If the program fails to start because it cannot find the shared libraries, you probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx and soundfile via

pip install sherpa-onnx soundfile

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-fr_FR-tjiho-model1.tar.bz2

You can use the following code to try vits-piper-fr_FR-tjiho-model1 with the Python API.

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-fr_FR-tjiho-model1/fr_FR-tjiho-model1.onnx",
            data_dir="vits-piper-fr_FR-tjiho-model1/espeak-ng-data",
            tokens="vits-piper-fr_FR-tjiho-model1/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Pas de nouvelles, bonnes nouvelles.",
                     sid=0,
                     speed=1.0)

sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
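
Before building the config, it can help to check that the model files are where the example expects them. The following is a minimal sketch, assuming the directory layout used in this section (the .onnx model, tokens.txt, and espeak-ng-data inside the model directory); the helper name missing_model_files is hypothetical.

```python
import os


def missing_model_files(model_dir, model_file):
    """Return the expected paths under model_dir that do not exist."""
    expected = [
        os.path.join(model_dir, model_file),        # the ONNX model
        os.path.join(model_dir, "tokens.txt"),      # token table
        os.path.join(model_dir, "espeak-ng-data"),  # espeak-ng data directory
    ]
    return [p for p in expected if not os.path.exists(p)]
```

For this model, missing_model_files("vits-piper-fr_FR-tjiho-model1", "fr_FR-tjiho-model1.onnx") should return an empty list when the archive was extracted into the current directory.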

C++ API

You can use the following code to try vits-piper-fr_FR-tjiho-model1 with the C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-fr_FR-tjiho-model1/fr_FR-tjiho-model1.onnx";
  config.model.vits.tokens = "vits-piper-fr_FR-tjiho-model1/tokens.txt";
  config.model.vits.data_dir = "vits-piper-fr_FR-tjiho-model1/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Pas de nouvelles, bonnes nouvelles.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++ \
  -std=c++17 \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-piper \
  /tmp/test-piper.cc \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-cxx-api \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

If the program fails to start because it cannot find the shared libraries, you probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to try vits-piper-fr_FR-tjiho-model1 with the Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-fr_FR-tjiho-model1/fr_FR-tjiho-model1.onnx".into()),
                tokens: Some("vits-piper-fr_FR-tjiho-model1/tokens.txt".into()),
                data_dir: Some("vits-piper-fr_FR-tjiho-model1/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Pas de nouvelles, bonnes nouvelles.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to try vits-piper-fr_FR-tjiho-model1 with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-fr_FR-tjiho-model1/fr_FR-tjiho-model1.onnx',
        tokens: 'vits-piper-fr_FR-tjiho-model1/tokens.txt',
        dataDir: 'vits-piper-fr_FR-tjiho-model1/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Pas de nouvelles, bonnes nouvelles.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to try vits-piper-fr_FR-tjiho-model1 with the Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-fr_FR-tjiho-model1/fr_FR-tjiho-model1.onnx',
    tokens: 'vits-piper-fr_FR-tjiho-model1/tokens.txt',
    dataDir: 'vits-piper-fr_FR-tjiho-model1/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Pas de nouvelles, bonnes nouvelles.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to try vits-piper-fr_FR-tjiho-model1 with the Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-fr_FR-tjiho-model1/fr_FR-tjiho-model1.onnx",
    lexicon: "",
    tokens: "vits-piper-fr_FR-tjiho-model1/tokens.txt",
    dataDir: "vits-piper-fr_FR-tjiho-model1/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Pas de nouvelles, bonnes nouvelles."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to try vits-piper-fr_FR-tjiho-model1 with the C# API.

using System;
using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-fr_FR-tjiho-model1/fr_FR-tjiho-model1.onnx";
config.Model.Vits.Tokens = "vits-piper-fr_FR-tjiho-model1/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-fr_FR-tjiho-model1/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Pas de nouvelles, bonnes nouvelles.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to try vits-piper-fr_FR-tjiho-model1 with the Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-fr_FR-tjiho-model1/fr_FR-tjiho-model1.onnx",
        tokens = "vits-piper-fr_FR-tjiho-model1/tokens.txt",
        dataDir = "vits-piper-fr_FR-tjiho-model1/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Pas de nouvelles, bonnes nouvelles.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to try vits-piper-fr_FR-tjiho-model1 with the Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-fr_FR-tjiho-model1/fr_FR-tjiho-model1.onnx");
    vits.setTokens("vits-piper-fr_FR-tjiho-model1/tokens.txt");
    vits.setDataDir("vits-piper-fr_FR-tjiho-model1/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Pas de nouvelles, bonnes nouvelles.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-fr_FR-tjiho-model1 with Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-fr_FR-tjiho-model1/fr_FR-tjiho-model1.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-fr_FR-tjiho-model1/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-fr_FR-tjiho-model1/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Pas de nouvelles, bonnes nouvelles.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-fr_FR-tjiho-model1 with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-fr_FR-tjiho-model1/fr_FR-tjiho-model1.onnx",
				Tokens:  "vits-piper-fr_FR-tjiho-model1/tokens.txt",
				DataDir: "vits-piper-fr_FR-tjiho-model1/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Pas de nouvelles, bonnes nouvelles."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Pas de nouvelles, bonnes nouvelles.

sample audios for different speakers are listed below:

Speaker 0

vits-piper-fr_FR-tjiho-model2

Info about this model

This model is converted from https://huggingface.co/csukuangfj/vits-piper-fr_FR-tjiho-model2/tree/main

Number of speakers | Sample rate (Hz)
1 | 44100
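
Once one of the examples in this section has written test.wav, the values in the table can be checked against the file header. The following is a minimal sketch using only the Python standard library; `wav_info` and `check_sample_rate` are hypothetical helper names, and the sketch assumes the file is a PCM WAV, which is what the `wave` module can read.

```python
import wave
from typing import Tuple

EXPECTED_SAMPLE_RATE = 44100  # taken from the table above


def wav_info(path: str) -> Tuple[int, int, float]:
    """Return (channels, sample_rate, duration_in_seconds) of a PCM WAV file."""
    with wave.open(path, "rb") as f:
        rate = f.getframerate()
        return f.getnchannels(), rate, f.getnframes() / rate


def check_sample_rate(path: str, expected: int = EXPECTED_SAMPLE_RATE) -> bool:
    """True if the file's sample rate matches the expected value."""
    return wav_info(path)[1] == expected
```

For example, `check_sample_rate("test.wav")` should return `True` for audio generated by this model if the table above is accurate.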

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-fr_FR-tjiho-model2.tar.bz2
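
If you prefer to fetch and unpack the archive from a script, the following is a minimal sketch using only the Python standard library. `extract_model` and `download_model` are hypothetical helper names; the URL is the one given above, and the sketch assumes the archive unpacks into a directory named after the model.

```python
import tarfile
import urllib.request
from pathlib import Path
from typing import List

MODEL = "vits-piper-fr_FR-tjiho-model2"
URL = (
    "https://github.com/k2-fsa/sherpa-onnx/releases/download/"
    f"tts-models/{MODEL}.tar.bz2"
)


def extract_model(archive: Path, dest: Path) -> List[str]:
    """Extract a .tar.bz2 archive into dest and return its member names."""
    with tarfile.open(archive, "r:bz2") as tar:
        names = tar.getnames()
        tar.extractall(dest)
    return names


def download_model(url: str = URL, dest: Path = Path(".")) -> Path:
    """Download the archive (needs network access), unpack it into dest,
    and return the directory the model is assumed to unpack into."""
    archive = dest / url.rsplit("/", 1)[-1]
    urllib.request.urlretrieve(url, str(archive))
    extract_model(archive, dest)
    return dest / archive.name[: -len(".tar.bz2")]


# Usage (not executed here because it downloads a large file):
#   model_dir = download_model()
```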

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI | URL | China mirror
arm64-v8a | Download | Download
armeabi-v7a | Download | Download
x86_64 | Download | Download
x86 | Download | Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-fr_FR-tjiho-model2 with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-fr_FR-tjiho-model2/fr_FR-tjiho-model2.onnx";
  config.model.vits.tokens = "vits-piper-fr_FR-tjiho-model2/tokens.txt";
  config.model.vits.data_dir = "vits-piper-fr_FR-tjiho-model2/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

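  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
  if (!tts) {
    // Creation fails, for example, when a model path is wrong
    fprintf(stderr, "Failed to create the TTS engine. Please check your config\n");
    return 1;
  }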

  const char *text = "Pas de nouvelles, bonnes nouvelles.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.c -lsherpa-onnx-c-api -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.
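
If you launch the binary from a script rather than from a shell, the same library search path can be passed through the child process environment. The following is a minimal sketch; `library_path_env` is a hypothetical helper name, and the paths in the usage note are the ones assumed in the steps above.

```python
import os
import sys
from typing import Dict, Optional


def library_path_env(
    lib_dir: str,
    base_env: Optional[Dict[str, str]] = None,
    platform: str = sys.platform,
) -> Dict[str, str]:
    """Return a copy of the environment with lib_dir prepended to the
    dynamic-loader search path: DYLD_LIBRARY_PATH on macOS,
    LD_LIBRARY_PATH otherwise."""
    env = dict(os.environ if base_env is None else base_env)
    var = "DYLD_LIBRARY_PATH" if platform == "darwin" else "LD_LIBRARY_PATH"
    previous = env.get(var, "")
    # Prepend so the freshly built libraries win over system copies
    env[var] = f"{lib_dir}:{previous}" if previous else lib_dir
    return env
```

The result can be passed to `subprocess.run(["/tmp/test-piper"], cwd="/tmp", env=library_path_env("/tmp/sherpa-onnx/shared/lib"), check=True)`.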

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx and soundfile (used below to write the audio file) via

pip install sherpa-onnx soundfile

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-fr_FR-tjiho-model2.tar.bz2

You can use the following code to play with vits-piper-fr_FR-tjiho-model2 with the Python API.

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-fr_FR-tjiho-model2/fr_FR-tjiho-model2.onnx",
            data_dir="vits-piper-fr_FR-tjiho-model2/espeak-ng-data",
            tokens="vits-piper-fr_FR-tjiho-model2/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Pas de nouvelles, bonnes nouvelles.",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)

C++ API

You can use the following code to play with vits-piper-fr_FR-tjiho-model2 with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-fr_FR-tjiho-model2/fr_FR-tjiho-model2.onnx";
  config.model.vits.tokens = "vits-piper-fr_FR-tjiho-model2/tokens.txt";
  config.model.vits.data_dir = "vits-piper-fr_FR-tjiho-model2/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Pas de nouvelles, bonnes nouvelles.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.cc -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with vits-piper-fr_FR-tjiho-model2 with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-fr_FR-tjiho-model2/fr_FR-tjiho-model2.onnx".into()),
                tokens: Some("vits-piper-fr_FR-tjiho-model2/tokens.txt".into()),
                data_dir: Some("vits-piper-fr_FR-tjiho-model2/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Pas de nouvelles, bonnes nouvelles.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-fr_FR-tjiho-model2 with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-fr_FR-tjiho-model2/fr_FR-tjiho-model2.onnx',
        tokens: 'vits-piper-fr_FR-tjiho-model2/tokens.txt',
        dataDir: 'vits-piper-fr_FR-tjiho-model2/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Pas de nouvelles, bonnes nouvelles.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.
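
The Node.js example above reports a real-time factor (RTF), computed as the elapsed generation time divided by the duration of the generated audio, so values below 1 mean generation is faster than playback. The following is a minimal Python sketch of the same arithmetic; `real_time_factor` is a hypothetical helper name, not a sherpa-onnx function.

```python
def real_time_factor(
    elapsed_seconds: float, num_samples: int, sample_rate: int
) -> float:
    """RTF = elapsed generation time / duration of the generated audio."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # audio length in seconds
    return elapsed_seconds / duration


# 88200 samples at 44100 Hz is 2 seconds of audio;
# generating it in 0.5 seconds gives an RTF of 0.25.
rtf = real_time_factor(0.5, 88200, 44100)
```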

Dart API

You can use the following code to play with vits-piper-fr_FR-tjiho-model2 with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-fr_FR-tjiho-model2/fr_FR-tjiho-model2.onnx',
    tokens: 'vits-piper-fr_FR-tjiho-model2/tokens.txt',
    dataDir: 'vits-piper-fr_FR-tjiho-model2/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Pas de nouvelles, bonnes nouvelles.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-fr_FR-tjiho-model2 with Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-fr_FR-tjiho-model2/fr_FR-tjiho-model2.onnx",
    lexicon: "",
    tokens: "vits-piper-fr_FR-tjiho-model2/tokens.txt",
    dataDir: "vits-piper-fr_FR-tjiho-model2/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Pas de nouvelles, bonnes nouvelles."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-fr_FR-tjiho-model2 with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-fr_FR-tjiho-model2/fr_FR-tjiho-model2.onnx";
config.Model.Vits.Tokens = "vits-piper-fr_FR-tjiho-model2/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-fr_FR-tjiho-model2/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Pas de nouvelles, bonnes nouvelles.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-fr_FR-tjiho-model2 with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  val config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-fr_FR-tjiho-model2/fr_FR-tjiho-model2.onnx",
        tokens = "vits-piper-fr_FR-tjiho-model2/tokens.txt",
        dataDir = "vits-piper-fr_FR-tjiho-model2/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Pas de nouvelles, bonnes nouvelles.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with vits-piper-fr_FR-tjiho-model2 with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-fr_FR-tjiho-model2/fr_FR-tjiho-model2.onnx");
    vits.setTokens("vits-piper-fr_FR-tjiho-model2/tokens.txt");
    vits.setDataDir("vits-piper-fr_FR-tjiho-model2/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Pas de nouvelles, bonnes nouvelles.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-fr_FR-tjiho-model2 with Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-fr_FR-tjiho-model2/fr_FR-tjiho-model2.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-fr_FR-tjiho-model2/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-fr_FR-tjiho-model2/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Pas de nouvelles, bonnes nouvelles.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-fr_FR-tjiho-model2 with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-fr_FR-tjiho-model2/fr_FR-tjiho-model2.onnx",
				Tokens:  "vits-piper-fr_FR-tjiho-model2/tokens.txt",
				DataDir: "vits-piper-fr_FR-tjiho-model2/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Pas de nouvelles, bonnes nouvelles."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Pas de nouvelles, bonnes nouvelles.

sample audios for different speakers are listed below:

Speaker 0

vits-piper-fr_FR-tjiho-model3

Info about this model

This model is converted from https://huggingface.co/csukuangfj/vits-piper-fr_FR-tjiho-model3/tree/main

Number of speakers | Sample rate (Hz)
1 | 44100

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-fr_FR-tjiho-model3.tar.bz2

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI | URL | China mirror
arm64-v8a | Download | Download
armeabi-v7a | Download | Download
x86_64 | Download | Download
x86 | Download | Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-fr_FR-tjiho-model3 with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-fr_FR-tjiho-model3/fr_FR-tjiho-model3.onnx";
  config.model.vits.tokens = "vits-piper-fr_FR-tjiho-model3/tokens.txt";
  config.model.vits.data_dir = "vits-piper-fr_FR-tjiho-model3/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

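  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
  if (!tts) {
    // Creation fails, for example, when a model path is wrong
    fprintf(stderr, "Failed to create the TTS engine. Please check your config\n");
    return 1;
  }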

  const char *text = "Pas de nouvelles, bonnes nouvelles.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.c -lsherpa-onnx-c-api -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx and soundfile (used below to write the audio file) via

pip install sherpa-onnx soundfile

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-fr_FR-tjiho-model3.tar.bz2

You can use the following code to play with vits-piper-fr_FR-tjiho-model3 with the Python API.

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-fr_FR-tjiho-model3/fr_FR-tjiho-model3.onnx",
            data_dir="vits-piper-fr_FR-tjiho-model3/espeak-ng-data",
            tokens="vits-piper-fr_FR-tjiho-model3/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Pas de nouvelles, bonnes nouvelles.",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)

C++ API

You can use the following code to play with vits-piper-fr_FR-tjiho-model3 with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-fr_FR-tjiho-model3/fr_FR-tjiho-model3.onnx";
  config.model.vits.tokens = "vits-piper-fr_FR-tjiho-model3/tokens.txt";
  config.model.vits.data_dir = "vits-piper-fr_FR-tjiho-model3/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Pas de nouvelles, bonnes nouvelles.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.cc -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with vits-piper-fr_FR-tjiho-model3 with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-fr_FR-tjiho-model3/fr_FR-tjiho-model3.onnx".into()),
                tokens: Some("vits-piper-fr_FR-tjiho-model3/tokens.txt".into()),
                data_dir: Some("vits-piper-fr_FR-tjiho-model3/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Pas de nouvelles, bonnes nouvelles.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-fr_FR-tjiho-model3 with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-fr_FR-tjiho-model3/fr_FR-tjiho-model3.onnx',
        tokens: 'vits-piper-fr_FR-tjiho-model3/tokens.txt',
        dataDir: 'vits-piper-fr_FR-tjiho-model3/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Pas de nouvelles, bonnes nouvelles.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.
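
The Node.js example above prints a real-time factor (RTF), that is, the time spent generating divided by the duration of the generated audio. A value below 1 means synthesis runs faster than playback. The same arithmetic is sketched below in Python; the function name real_time_factor is hypothetical and not part of sherpa-onnx.

```python
def real_time_factor(elapsed_seconds: float, num_samples: int, sample_rate: int) -> float:
    """Return elapsed time divided by audio duration (the RTF)."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # audio length in seconds
    return elapsed_seconds / duration


# Example: 0.5 s of compute for 44100 samples at 22050 Hz (2 s of audio).
rtf = real_time_factor(0.5, 44100, 22050)
print(f"RTF = {rtf:.3f}")  # RTF = 0.250
```

With real output, num_samples would be the length of the generated sample array and sample_rate the rate reported with it.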

Dart API

You can use the following code to try vits-piper-fr_FR-tjiho-model3 with the Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-fr_FR-tjiho-model3/fr_FR-tjiho-model3.onnx',
    tokens: 'vits-piper-fr_FR-tjiho-model3/tokens.txt',
    dataDir: 'vits-piper-fr_FR-tjiho-model3/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Pas de nouvelles, bonnes nouvelles.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to try vits-piper-fr_FR-tjiho-model3 with the Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-fr_FR-tjiho-model3/fr_FR-tjiho-model3.onnx",
    lexicon: "",
    tokens: "vits-piper-fr_FR-tjiho-model3/tokens.txt",
    dataDir: "vits-piper-fr_FR-tjiho-model3/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Pas de nouvelles, bonnes nouvelles."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to try vits-piper-fr_FR-tjiho-model3 with the C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-fr_FR-tjiho-model3/fr_FR-tjiho-model3.onnx";
config.Model.Vits.Tokens = "vits-piper-fr_FR-tjiho-model3/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-fr_FR-tjiho-model3/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Pas de nouvelles, bonnes nouvelles.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to try vits-piper-fr_FR-tjiho-model3 with the Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-fr_FR-tjiho-model3/fr_FR-tjiho-model3.onnx",
        tokens = "vits-piper-fr_FR-tjiho-model3/tokens.txt",
        dataDir = "vits-piper-fr_FR-tjiho-model3/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Pas de nouvelles, bonnes nouvelles.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to try vits-piper-fr_FR-tjiho-model3 with the Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-fr_FR-tjiho-model3/fr_FR-tjiho-model3.onnx");
    vits.setTokens("vits-piper-fr_FR-tjiho-model3/tokens.txt");
    vits.setDataDir("vits-piper-fr_FR-tjiho-model3/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Pas de nouvelles, bonnes nouvelles.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to try vits-piper-fr_FR-tjiho-model3 with the Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-fr_FR-tjiho-model3/fr_FR-tjiho-model3.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-fr_FR-tjiho-model3/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-fr_FR-tjiho-model3/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Pas de nouvelles, bonnes nouvelles.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to try vits-piper-fr_FR-tjiho-model3 with the Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-fr_FR-tjiho-model3/fr_FR-tjiho-model3.onnx",
				Tokens:  "vits-piper-fr_FR-tjiho-model3/tokens.txt",
				DataDir: "vits-piper-fr_FR-tjiho-model3/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Pas de nouvelles, bonnes nouvelles."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Pas de nouvelles, bonnes nouvelles.

audio samples for each speaker are listed below:

Speaker 0

vits-piper-fr_FR-tom-medium

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/fr/fr_FR/tom/medium

Number of speakers: 1
Sample rate: 44100 Hz

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-fr_FR-tom-medium.tar.bz2
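
The archive can also be fetched and unpacked from a script. The following is a minimal Python sketch using only the standard library. The helper name download_and_extract is hypothetical, and the sketch assumes the archive unpacks into a directory named after the archive, which matches the paths used in the examples for this model.

```python
import tarfile
import urllib.request
from pathlib import Path

MODEL_URL = (
    "https://github.com/k2-fsa/sherpa-onnx/releases/download/"
    "tts-models/vits-piper-fr_FR-tom-medium.tar.bz2"
)


def download_and_extract(url: str, dest_dir: str = ".") -> Path:
    """Download a .tar.bz2 archive, extract it into dest_dir, delete the archive.

    Returns the directory the archive is assumed to unpack into
    (the archive file name without the .tar.bz2 suffix).
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    archive = dest / url.rsplit("/", 1)[-1]
    urllib.request.urlretrieve(url, str(archive))
    with tarfile.open(archive, "r:bz2") as tar:
        tar.extractall(dest)
    archive.unlink()  # same effect as the rm step after tar xvf
    return dest / archive.name[: -len(".tar.bz2")]


# Usage (performs a network download):
# model_dir = download_and_extract(MODEL_URL)
# print(f"Model extracted to {model_dir}")
```

The usage lines are left commented out so that importing or pasting the block does not start a download.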

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI            URL        China mirror
arm64-v8a      Download   Download
armeabi-v7a    Download   Download
x86_64         Download   Download
x86            Download   Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to try vits-piper-fr_FR-tom-medium with the C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-fr_FR-tom-medium/fr_FR-tom-medium.onnx";
  config.model.vits.tokens = "vits-piper-fr_FR-tom-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-fr_FR-tom-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Pas de nouvelles, bonnes nouvelles.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc /tmp/test-piper.c -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime -o /tmp/test-piper

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-fr_FR-tom-medium.tar.bz2

You can use the following code to play with vits-piper-fr_FR-tom-medium

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-fr_FR-tom-medium/fr_FR-tom-medium.onnx",
            data_dir="vits-piper-fr_FR-tom-medium/espeak-ng-data",
            tokens="vits-piper-fr_FR-tom-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Pas de nouvelles, bonnes nouvelles.",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
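
The Python example above relies on the soundfile package to save the result. If that package is not available, the samples could also be written as 16-bit PCM with the standard wave module, as in the sketch below. The helper name write_wave_16bit is hypothetical, and the sketch assumes the samples are floats in the range [-1, 1]; values outside that range are clipped.

```python
import struct
import wave


def write_wave_16bit(filename: str, samples, sample_rate: int) -> None:
    """Write mono float samples (assumed to lie in [-1, 1]) as 16-bit PCM."""
    pcm = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))  # clip to the valid range
        pcm += struct.pack("<h", int(round(s * 32767)))  # little-endian int16
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)   # mono
        f.setsampwidth(2)   # 2 bytes per sample
        f.setframerate(sample_rate)
        f.writeframes(bytes(pcm))


# Usage with the objects from the example above:
# write_wave_16bit("test.wav", audio.samples, audio.sample_rate)
```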

C++ API

You can use the following code to try vits-piper-fr_FR-tom-medium with the C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-fr_FR-tom-medium/fr_FR-tom-medium.onnx";
  config.model.vits.tokens = "vits-piper-fr_FR-tom-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-fr_FR-tom-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Pas de nouvelles, bonnes nouvelles.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++ -std=c++17 /tmp/test-piper.cc -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime -o /tmp/test-piper

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html
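
The progress callback in the C++ example above returns 1 to continue generating and 0 to stop. The sketch below illustrates that contract in Python with a callback that asks generation to stop once a sample budget is reached. The factory name make_limited_callback and the (samples, progress) signature are assumptions modeled on the C++ callback; they are not taken from the sherpa-onnx Python bindings, and the generator is simulated with a plain loop.

```python
def make_limited_callback(max_samples: int):
    """Build a callback that requests a stop after max_samples samples."""
    received = 0

    def callback(samples, progress: float) -> int:
        nonlocal received
        received += len(samples)
        # 1 means continue, 0 means stop (same convention as the C++ callback).
        return 1 if received < max_samples else 0

    return callback


# Simulate a generator that delivers audio in three chunks of 1000 samples.
cb = make_limited_callback(max_samples=2500)
results = [cb([0.0] * 1000, p) for p in (0.33, 0.66, 1.0)]
print(results)  # [1, 1, 0]
```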

Rust API

You can use the following code to try vits-piper-fr_FR-tom-medium with the Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-fr_FR-tom-medium/fr_FR-tom-medium.onnx".into()),
                tokens: Some("vits-piper-fr_FR-tom-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-fr_FR-tom-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Pas de nouvelles, bonnes nouvelles.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-fr_FR-tom-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-fr_FR-tom-medium/fr_FR-tom-medium.onnx',
        tokens: 'vits-piper-fr_FR-tom-medium/tokens.txt',
        dataDir: 'vits-piper-fr_FR-tom-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Pas de nouvelles, bonnes nouvelles.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to try vits-piper-fr_FR-tom-medium with the Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-fr_FR-tom-medium/fr_FR-tom-medium.onnx',
    tokens: 'vits-piper-fr_FR-tom-medium/tokens.txt',
    dataDir: 'vits-piper-fr_FR-tom-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Pas de nouvelles, bonnes nouvelles.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to try vits-piper-fr_FR-tom-medium with the Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-fr_FR-tom-medium/fr_FR-tom-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-fr_FR-tom-medium/tokens.txt",
    dataDir: "vits-piper-fr_FR-tom-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Pas de nouvelles, bonnes nouvelles."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to try vits-piper-fr_FR-tom-medium with the C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-fr_FR-tom-medium/fr_FR-tom-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-fr_FR-tom-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-fr_FR-tom-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Pas de nouvelles, bonnes nouvelles.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to try vits-piper-fr_FR-tom-medium with the Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-fr_FR-tom-medium/fr_FR-tom-medium.onnx",
        tokens = "vits-piper-fr_FR-tom-medium/tokens.txt",
        dataDir = "vits-piper-fr_FR-tom-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Pas de nouvelles, bonnes nouvelles.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to try vits-piper-fr_FR-tom-medium with the Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-fr_FR-tom-medium/fr_FR-tom-medium.onnx");
    vits.setTokens("vits-piper-fr_FR-tom-medium/tokens.txt");
    vits.setDataDir("vits-piper-fr_FR-tom-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Pas de nouvelles, bonnes nouvelles.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to try vits-piper-fr_FR-tom-medium with the Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-fr_FR-tom-medium/fr_FR-tom-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-fr_FR-tom-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-fr_FR-tom-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Pas de nouvelles, bonnes nouvelles.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to try vits-piper-fr_FR-tom-medium with the Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-fr_FR-tom-medium/fr_FR-tom-medium.onnx",
				Tokens:  "vits-piper-fr_FR-tom-medium/tokens.txt",
				DataDir: "vits-piper-fr_FR-tom-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Pas de nouvelles, bonnes nouvelles."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Pas de nouvelles, bonnes nouvelles.

audio samples for each speaker are listed below:

Speaker 0

vits-piper-fr_FR-upmc-medium

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/fr/fr_FR/upmc/medium

Number of speakers: 2
Sample rate: 22050 Hz

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-fr_FR-upmc-medium.tar.bz2

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI            URL        China mirror
arm64-v8a      Download   Download
armeabi-v7a    Download   Download
x86_64         Download   Download
x86            Download   Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to try vits-piper-fr_FR-upmc-medium with the C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-fr_FR-upmc-medium/fr_FR-upmc-medium.onnx";
  config.model.vits.tokens = "vits-piper-fr_FR-upmc-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-fr_FR-upmc-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Pas de nouvelles, bonnes nouvelles.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc /tmp/test-piper.c -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime -o /tmp/test-piper

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-fr_FR-upmc-medium.tar.bz2

You can use the following code to play with vits-piper-fr_FR-upmc-medium

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-fr_FR-upmc-medium/fr_FR-upmc-medium.onnx",
            data_dir="vits-piper-fr_FR-upmc-medium/espeak-ng-data",
            tokens="vits-piper-fr_FR-upmc-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Pas de nouvelles, bonnes nouvelles.",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
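
This model has more than one speaker, and the example above only generates audio for sid=0. The sketch below loops over every speaker ID. The helper name generate_for_all_speakers is hypothetical; it only assumes an object whose generate method accepts the text, sid, and speed keyword arguments used in the example above, and the number of speakers is passed in explicitly.

```python
def generate_for_all_speakers(tts, text: str, num_speakers: int, speed: float = 1.0):
    """Return a dict mapping each speaker ID in [0, num_speakers) to its audio."""
    audios = {}
    for sid in range(num_speakers):
        audios[sid] = tts.generate(text=text, sid=sid, speed=speed)
    return audios


# Usage with the objects from the example above (2 speakers for this model):
# audios = generate_for_all_speakers(tts, "Pas de nouvelles, bonnes nouvelles.", 2)
# for sid, audio in audios.items():
#     sf.write(f"test-{sid}.wav", audio.samples, samplerate=audio.sample_rate)
```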

C++ API

You can use the following code to try vits-piper-fr_FR-upmc-medium with the C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-fr_FR-upmc-medium/fr_FR-upmc-medium.onnx";
  config.model.vits.tokens = "vits-piper-fr_FR-upmc-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-fr_FR-upmc-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Pas de nouvelles, bonnes nouvelles.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
  -DSHERPA_ONNX_ENABLE_C_API=ON \
  -DCMAKE_BUILD_TYPE=Release \
  -DBUILD_SHARED_LIBS=ON \
  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++ \
  -std=c++17 \
  -I /tmp/sherpa-onnx/shared/include \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-cxx-api \
  -lsherpa-onnx-c-api \
  -lonnxruntime \
  -o /tmp/test-piper \
  /tmp/test-piper.cc

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to try vits-piper-fr_FR-upmc-medium with the Rust API:

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-fr_FR-upmc-medium/fr_FR-upmc-medium.onnx".into()),
                tokens: Some("vits-piper-fr_FR-upmc-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-fr_FR-upmc-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Pas de nouvelles, bonnes nouvelles.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-fr_FR-upmc-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-fr_FR-upmc-medium/fr_FR-upmc-medium.onnx',
        tokens: 'vits-piper-fr_FR-upmc-medium/tokens.txt',
        dataDir: 'vits-piper-fr_FR-upmc-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Pas de nouvelles, bonnes nouvelles.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.
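
The Node.js example above reports a real-time factor (RTF): the elapsed wall-clock time divided by the duration of the generated audio. For readers using the Python API, the same arithmetic can be sketched as follows. The function name `real_time_factor` is hypothetical and not part of sherpa-onnx, and the numbers in the demo are made up for illustration.

```python
def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """Elapsed time divided by audio duration; below 1 means faster than real time."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # audio duration in seconds
    return elapsed_seconds / duration


if __name__ == "__main__":
    # Hypothetical numbers: 2 s of audio at 22050 Hz generated in 0.5 s.
    print(f"RTF = {real_time_factor(0.5, 44100, 22050):.3f}")
```

With the Python API, `num_samples` would be `len(audio.samples)` and `sample_rate` would be `audio.sample_rate`, with the elapsed time measured around the call to `tts.generate`.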

Dart API

You can use the following code to try vits-piper-fr_FR-upmc-medium with the Dart API:

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-fr_FR-upmc-medium/fr_FR-upmc-medium.onnx',
    tokens: 'vits-piper-fr_FR-upmc-medium/tokens.txt',
    dataDir: 'vits-piper-fr_FR-upmc-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Pas de nouvelles, bonnes nouvelles.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to try vits-piper-fr_FR-upmc-medium with the Swift API:

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-fr_FR-upmc-medium/fr_FR-upmc-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-fr_FR-upmc-medium/tokens.txt",
    dataDir: "vits-piper-fr_FR-upmc-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Pas de nouvelles, bonnes nouvelles."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to try vits-piper-fr_FR-upmc-medium with the C# API:

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-fr_FR-upmc-medium/fr_FR-upmc-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-fr_FR-upmc-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-fr_FR-upmc-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Pas de nouvelles, bonnes nouvelles.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to try vits-piper-fr_FR-upmc-medium with the Kotlin API:

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-fr_FR-upmc-medium/fr_FR-upmc-medium.onnx",
        tokens = "vits-piper-fr_FR-upmc-medium/tokens.txt",
        dataDir = "vits-piper-fr_FR-upmc-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Pas de nouvelles, bonnes nouvelles.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to try vits-piper-fr_FR-upmc-medium with the Java API:

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-fr_FR-upmc-medium/fr_FR-upmc-medium.onnx");
    vits.setTokens("vits-piper-fr_FR-upmc-medium/tokens.txt");
    vits.setDataDir("vits-piper-fr_FR-upmc-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Pas de nouvelles, bonnes nouvelles.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to try vits-piper-fr_FR-upmc-medium with the Pascal API:

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-fr_FR-upmc-medium/fr_FR-upmc-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-fr_FR-upmc-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-fr_FR-upmc-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Pas de nouvelles, bonnes nouvelles.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to try vits-piper-fr_FR-upmc-medium with the Go API:

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-fr_FR-upmc-medium/fr_FR-upmc-medium.onnx",
				Tokens:  "vits-piper-fr_FR-upmc-medium/tokens.txt",
				DataDir: "vits-piper-fr_FR-upmc-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Pas de nouvelles, bonnes nouvelles."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Pas de nouvelles, bonnes nouvelles.

audio samples for each speaker are listed below:

Speaker 0

Speaker 1

supertonic-3-fr

Info about this model

This model is supertonic 3 from https://huggingface.co/Supertone/supertonic-3

It supports 31 languages: en, ko, ja, ar, bg, cs, da, de, el, es, et, fi, fr, hi, hr, hu, id, it, lt, lv, nl, pl, pt, ro, ru, sk, sl, sv, tr, uk, vi.

This page shows samples for French (fr).

Number of speakers | Sample rate
10 | 24000

Speaker IDs

sid | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-supertonic-3-tts-int8-2026-05-11.tar.bz2

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI | URL | China mirror
arm64-v8a | Download | Download
armeabi-v7a | Download | Download
x86_64 | Download | Download
x86 | Download | Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

Python API

Assume you have installed sherpa-onnx and soundfile (used below to write the audio file) via

pip install sherpa-onnx soundfile

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-supertonic-3-tts-int8-2026-05-11.tar.bz2

You can use the following code to play with supertonic-3

import sherpa_onnx
import soundfile as sf

tts_config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        supertonic=sherpa_onnx.OfflineTtsSupertonicModelConfig(
            duration_predictor="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
            text_encoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
            vector_estimator="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
            vocoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
            tts_json="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
            unicode_indexer="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
            voice_style="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
        ),
        debug=False,
        num_threads=2,
        provider="cpu",
    ),
)

tts = sherpa_onnx.OfflineTts(tts_config)

gen_config = sherpa_onnx.GenerationConfig()
gen_config.sid = 0
gen_config.num_steps = 8
gen_config.speed = 1.0
gen_config.extra["lang"] = "fr"

audio = tts.generate("Il s'agit d'un moteur de synthèse vocale utilisant Kaldi de nouvelle génération", gen_config)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
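
The Python example sets the language with `gen_config.extra["lang"] = "fr"`, while the C, C++, Rust, and C# examples on this page pass the same setting as a JSON string. The following is a minimal sketch, not part of sherpa-onnx, that validates a language code against the 31 codes listed above and builds that JSON string. The names `SUPPORTED_LANGS` and `make_extra` are hypothetical.

```python
import json

# Language codes copied from the list of 31 supported languages above.
SUPPORTED_LANGS = (
    "en ko ja ar bg cs da de el es et fi fr hi hr hu id it lt lv "
    "nl pl pt ro ru sk sl sv tr uk vi"
).split()


def make_extra(lang):
    """Return a JSON string such as '{"lang": "fr"}' for a supported language."""
    if lang not in SUPPORTED_LANGS:
        raise ValueError(f"Unsupported language code: {lang!r}")
    return json.dumps({"lang": lang})


if __name__ == "__main__":
    print(make_extra("fr"))
```

Checking the code up front catches a mistyped language before it reaches the model.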

C API

You can use the following code to try supertonic-3 with the C API:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));
  config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
  config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
  config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
  config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
  config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
  config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
  config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";

  config.model.num_threads = 2;

  // If you don't want to see debug messages, please set it to 0
  config.model.debug = 1;

  const char *text = "Il s'agit d'un moteur de synthèse vocale utilisant Kaldi de nouvelle génération";

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.num_steps = 8;
  gen_cfg.speed = 1.0;
  gen_cfg.extra = "{\"lang\": \"fr\"}";

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg,
                                             ProgressCallback, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
 -DSHERPA_ONNX_ENABLE_C_API=ON \
 -DCMAKE_BUILD_TYPE=Release \
 -DBUILD_SHARED_LIBS=ON \
 -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
 ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-supertonic.c. Then you can compile it with the following command:

gcc \
  -I /tmp/sherpa-onnx/shared/include \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-c-api \
  -lonnxruntime \
  -o /tmp/test-supertonic \
  /tmp/test-supertonic.c

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-supertonic.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

C++ API

You can use the following code to try supertonic-3 with the C++ API:

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;

  config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
  config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
  config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
  config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
  config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
  config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
  config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";

  config.model.num_threads = 2;

  // If you don't want to see debug messages, please set it to 0
  config.model.debug = 1;

  std::string filename = "./test.wav";
  std::string text = "Il s'agit d'un moteur de synthèse vocale utilisant Kaldi de nouvelle génération";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.num_steps = 8;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed
  gen_cfg.extra = R"({"lang": "fr"})";

  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
 -DSHERPA_ONNX_ENABLE_C_API=ON \
 -DCMAKE_BUILD_TYPE=Release \
 -DBUILD_SHARED_LIBS=ON \
 -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
 ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-supertonic.cc. Then you can compile it with the following command:

g++ \
  -std=c++17 \
  -I /tmp/sherpa-onnx/shared/include \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-cxx-api \
  -lsherpa-onnx-c-api \
  -lonnxruntime \
  -o /tmp/test-supertonic \
  /tmp/test-supertonic.cc

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-supertonic.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to try supertonic-3 with the Rust API:

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsSupertonicModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            supertonic: OfflineTtsSupertonicModelConfig {
                duration_predictor: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx".into()),
                text_encoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx".into()),
                vector_estimator: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx".into()),
                vocoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx".into()),
                tts_json: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json".into()),
                unicode_indexer: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin".into()),
                voice_style: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Il s'agit d'un moteur de synthèse vocale utilisant Kaldi de nouvelle génération";

    let gen_config = GenerationConfig {
        sid: 0,
        num_steps: 8,
        speed: 1.0,
        extra: Some(r#"{"lang": "fr"}"#.into()),
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with supertonic-3 with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      supertonic: {
        durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
        textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
        vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
        vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
        ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
        unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
        voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
      },
      debug: true,
      numThreads: 2,
      provider: 'cpu',
    },
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

// Double quotes are used because the text contains apostrophes
const text = "Il s'agit d'un moteur de synthèse vocale utilisant Kaldi de nouvelle génération";

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  numSteps: 8,
  speed: 1.0,
  extra: {lang: 'fr'},
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to try supertonic-3 with the Dart API:

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final supertonic = sherpa_onnx.OfflineTtsSupertonicModelConfig(
    durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
    textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
    vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
    vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
    ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
    unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
    voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    supertonic: supertonic,
    numThreads: 2,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    numSteps: 8,
    speed: 1.0,
    extra: {'lang': 'fr'},
  );
  // Double quotes are used because the text contains apostrophes
  final audio = tts.generateWithConfig(text: "Il s'agit d'un moteur de synthèse vocale utilisant Kaldi de nouvelle génération", config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to try supertonic-3 with the Swift API:

func run() {
  let supertonic = sherpaOnnxOfflineTtsSupertonicModelConfig(
    durationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
    textEncoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
    vectorEstimator: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
    vocoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
    ttsJson: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
    unicodeIndexer: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
    voiceStyle: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(supertonic: supertonic)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Il s'agit d'un moteur de synthèse vocale utilisant Kaldi de nouvelle génération"
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.numSteps = 8
  genConfig.speed = 1.0
  genConfig.extra = ["lang": "fr"]

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to try supertonic-3 with the C# API:

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Supertonic.DurationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.Model.Supertonic.TextEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.Model.Supertonic.VectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.Model.Supertonic.Vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.Model.Supertonic.TtsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.Model.Supertonic.UnicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.Model.Supertonic.VoiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";

config.Model.NumThreads = 2;
config.Model.Debug = 1;
config.Model.Provider = "cpu";

var tts = new OfflineTts(config);
var text = "Il s'agit d'un moteur de synthèse vocale utilisant Kaldi de nouvelle génération";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.NumSteps = 8;
genConfig.Speed = 1.0f;
genConfig.Extra = "{\"lang\": \"fr\"}";

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with supertonic-3 with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      supertonic = OfflineTtsSupertonicModelConfig(
        durationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
        textEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
        vectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
        vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
        ttsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
        unicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
        voiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
      ),
      numThreads = 2,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    numSteps = 8,
    speed = 1.0f,
    extra = mapOf("lang" to "fr"),
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Il s'agit d'un moteur de synthèse vocale utilisant Kaldi de nouvelle génération",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with supertonic-3 with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var supertonic = new OfflineTtsSupertonicModelConfig();
    supertonic.setDurationPredictor("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx");
    supertonic.setTextEncoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx");
    supertonic.setVectorEstimator("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx");
    supertonic.setVocoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx");
    supertonic.setTtsJson("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json");
    supertonic.setUnicodeIndexer("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin");
    supertonic.setVoiceStyle("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setSupertonic(supertonic);
    modelConfig.setNumThreads(2);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);

    var tts = new OfflineTts(config);
    var text = "Il s'agit d'un moteur de synthèse vocale utilisant Kaldi de nouvelle génération";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setNumSteps(8);
    genConfig.setSpeed(1.0f);
    genConfig.setExtra("{\"lang\": \"fr\"}");

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with supertonic-3 with Pascal API.

program test_supertonic;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Supertonic.DurationPredictor := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx';
  Config.Model.Supertonic.TextEncoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx';
  Config.Model.Supertonic.VectorEstimator := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx';
  Config.Model.Supertonic.Vocoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx';
  Config.Model.Supertonic.TtsJson := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json';
  Config.Model.Supertonic.UnicodeIndexer := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin';
  Config.Model.Supertonic.VoiceStyle := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin';

  Config.Model.NumThreads := 2;
  Config.Model.Debug := True;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.NumSteps := 8;
  GenConfig.Speed := 1.0;
  GenConfig.Extra := '{"lang": "fr"}';

  Audio := Tts.GenerateWithConfig('Il s''agit d''un moteur de synthèse vocale utilisant Kaldi de nouvelle génération', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with supertonic-3 with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Supertonic: sherpa.OfflineTtsSupertonicModelConfig{
				DurationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
				TextEncoder:       "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
				VectorEstimator:   "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
				Vocoder:           "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
				TtsJson:           "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
				UnicodeIndexer:    "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
				VoiceStyle:        "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
			},
			NumThreads: 2,
			Debug:      true,
		},
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Il s'agit d'un moteur de synthèse vocale utilisant Kaldi de nouvelle génération"

	genConfig := sherpa.GenerationConfig{
		Sid:      0,
		NumSteps: 8,
		Speed:    1.0,
		Extra:    `{"lang": "fr"}`,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

Sample audio for each speaker is listed below:

Speaker 0

0

Bonjour le monde.

1

Comment allez-vous aujourd’hui?

2

Le ciel est bleu.

3

J’aime l’apprentissage automatique.

4

Python est incroyable.

5

Bonjour à tous.

6

L’intelligence artificielle grandit.

7

La synthèse vocale est fascinante.

8

Les réseaux neuronaux sont puissants.

9

Le texte en voix convertit le texte en audio.

10

Le rapide renard brun saute par-dessus le chien paresseux.

11

L’apprentissage automatique permet aux ordinateurs d’apprendre.

12

Le traitement du langage naturel aide les machines à comprendre.

13

L’apprentissage profond a révolutionné l’intelligence artificielle.

14

La technologie de synthèse vocale a considérablement progressé.

15

Le clonage vocal neuronal peut reproduire les styles de parole.

16

La normalisation du texte est importante pour la prononciation.

17

Les assistants vocaux nous aident à interagir avec la technologie.

18

Les systèmes TTS modernes utilisent l’apprentissage profond.

19

L’interaction homme machine est devenue plus intuitive.

Speaker 1

0

Bonjour le monde.

1

Comment allez-vous aujourd’hui?

2

Le ciel est bleu.

3

J’aime l’apprentissage automatique.

4

Python est incroyable.

5

Bonjour à tous.

6

L’intelligence artificielle grandit.

7

La synthèse vocale est fascinante.

8

Les réseaux neuronaux sont puissants.

9

Le texte en voix convertit le texte en audio.

10

Le rapide renard brun saute par-dessus le chien paresseux.

11

L’apprentissage automatique permet aux ordinateurs d’apprendre.

12

Le traitement du langage naturel aide les machines à comprendre.

13

L’apprentissage profond a révolutionné l’intelligence artificielle.

14

La technologie de synthèse vocale a considérablement progressé.

15

Le clonage vocal neuronal peut reproduire les styles de parole.

16

La normalisation du texte est importante pour la prononciation.

17

Les assistants vocaux nous aident à interagir avec la technologie.

18

Les systèmes TTS modernes utilisent l’apprentissage profond.

19

L’interaction homme machine est devenue plus intuitive.

Speaker 2

0

Bonjour le monde.

1

Comment allez-vous aujourd’hui?

2

Le ciel est bleu.

3

J’aime l’apprentissage automatique.

4

Python est incroyable.

5

Bonjour à tous.

6

L’intelligence artificielle grandit.

7

La synthèse vocale est fascinante.

8

Les réseaux neuronaux sont puissants.

9

Le texte en voix convertit le texte en audio.

10

Le rapide renard brun saute par-dessus le chien paresseux.

11

L’apprentissage automatique permet aux ordinateurs d’apprendre.

12

Le traitement du langage naturel aide les machines à comprendre.

13

L’apprentissage profond a révolutionné l’intelligence artificielle.

14

La technologie de synthèse vocale a considérablement progressé.

15

Le clonage vocal neuronal peut reproduire les styles de parole.

16

La normalisation du texte est importante pour la prononciation.

17

Les assistants vocaux nous aident à interagir avec la technologie.

18

Les systèmes TTS modernes utilisent l’apprentissage profond.

19

L’interaction homme machine est devenue plus intuitive.

Speaker 3

0

Bonjour le monde.

1

Comment allez-vous aujourd’hui?

2

Le ciel est bleu.

3

J’aime l’apprentissage automatique.

4

Python est incroyable.

5

Bonjour à tous.

6

L’intelligence artificielle grandit.

7

La synthèse vocale est fascinante.

8

Les réseaux neuronaux sont puissants.

9

Le texte en voix convertit le texte en audio.

10

Le rapide renard brun saute par-dessus le chien paresseux.

11

L’apprentissage automatique permet aux ordinateurs d’apprendre.

12

Le traitement du langage naturel aide les machines à comprendre.

13

L’apprentissage profond a révolutionné l’intelligence artificielle.

14

La technologie de synthèse vocale a considérablement progressé.

15

Le clonage vocal neuronal peut reproduire les styles de parole.

16

La normalisation du texte est importante pour la prononciation.

17

Les assistants vocaux nous aident à interagir avec la technologie.

18

Les systèmes TTS modernes utilisent l’apprentissage profond.

19

L’interaction homme machine est devenue plus intuitive.

Speaker 4

0

Bonjour le monde.

1

Comment allez-vous aujourd’hui?

2

Le ciel est bleu.

3

J’aime l’apprentissage automatique.

4

Python est incroyable.

5

Bonjour à tous.

6

L’intelligence artificielle grandit.

7

La synthèse vocale est fascinante.

8

Les réseaux neuronaux sont puissants.

9

Le texte en voix convertit le texte en audio.

10

Le rapide renard brun saute par-dessus le chien paresseux.

11

L’apprentissage automatique permet aux ordinateurs d’apprendre.

12

Le traitement du langage naturel aide les machines à comprendre.

13

L’apprentissage profond a révolutionné l’intelligence artificielle.

14

La technologie de synthèse vocale a considérablement progressé.

15

Le clonage vocal neuronal peut reproduire les styles de parole.

16

La normalisation du texte est importante pour la prononciation.

17

Les assistants vocaux nous aident à interagir avec la technologie.

18

Les systèmes TTS modernes utilisent l’apprentissage profond.

19

L’interaction homme machine est devenue plus intuitive.

Speaker 5

0

Bonjour le monde.

1

Comment allez-vous aujourd’hui?

2

Le ciel est bleu.

3

J’aime l’apprentissage automatique.

4

Python est incroyable.

5

Bonjour à tous.

6

L’intelligence artificielle grandit.

7

La synthèse vocale est fascinante.

8

Les réseaux neuronaux sont puissants.

9

Le texte en voix convertit le texte en audio.

10

Le rapide renard brun saute par-dessus le chien paresseux.

11

L’apprentissage automatique permet aux ordinateurs d’apprendre.

12

Le traitement du langage naturel aide les machines à comprendre.

13

L’apprentissage profond a révolutionné l’intelligence artificielle.

14

La technologie de synthèse vocale a considérablement progressé.

15

Le clonage vocal neuronal peut reproduire les styles de parole.

16

La normalisation du texte est importante pour la prononciation.

17

Les assistants vocaux nous aident à interagir avec la technologie.

18

Les systèmes TTS modernes utilisent l’apprentissage profond.

19

L’interaction homme machine est devenue plus intuitive.

Speaker 6

0

Bonjour le monde.

1

Comment allez-vous aujourd’hui?

2

Le ciel est bleu.

3

J’aime l’apprentissage automatique.

4

Python est incroyable.

5

Bonjour à tous.

6

L’intelligence artificielle grandit.

7

La synthèse vocale est fascinante.

8

Les réseaux neuronaux sont puissants.

9

Le texte en voix convertit le texte en audio.

10

Le rapide renard brun saute par-dessus le chien paresseux.

11

L’apprentissage automatique permet aux ordinateurs d’apprendre.

12

Le traitement du langage naturel aide les machines à comprendre.

13

L’apprentissage profond a révolutionné l’intelligence artificielle.

14

La technologie de synthèse vocale a considérablement progressé.

15

Le clonage vocal neuronal peut reproduire les styles de parole.

16

La normalisation du texte est importante pour la prononciation.

17

Les assistants vocaux nous aident à interagir avec la technologie.

18

Les systèmes TTS modernes utilisent l’apprentissage profond.

19

L’interaction homme machine est devenue plus intuitive.

Speaker 7

0

Bonjour le monde.

1

Comment allez-vous aujourd’hui?

2

Le ciel est bleu.

3

J’aime l’apprentissage automatique.

4

Python est incroyable.

5

Bonjour à tous.

6

L’intelligence artificielle grandit.

7

La synthèse vocale est fascinante.

8

Les réseaux neuronaux sont puissants.

9

Le texte en voix convertit le texte en audio.

10

Le rapide renard brun saute par-dessus le chien paresseux.

11

L’apprentissage automatique permet aux ordinateurs d’apprendre.

12

Le traitement du langage naturel aide les machines à comprendre.

13

L’apprentissage profond a révolutionné l’intelligence artificielle.

14

La technologie de synthèse vocale a considérablement progressé.

15

Le clonage vocal neuronal peut reproduire les styles de parole.

16

La normalisation du texte est importante pour la prononciation.

17

Les assistants vocaux nous aident à interagir avec la technologie.

18

Les systèmes TTS modernes utilisent l’apprentissage profond.

19

L’interaction homme machine est devenue plus intuitive.

Speaker 8

0

Bonjour le monde.

1

Comment allez-vous aujourd’hui?

2

Le ciel est bleu.

3

J’aime l’apprentissage automatique.

4

Python est incroyable.

5

Bonjour à tous.

6

L’intelligence artificielle grandit.

7

La synthèse vocale est fascinante.

8

Les réseaux neuronaux sont puissants.

9

Le texte en voix convertit le texte en audio.

10

Le rapide renard brun saute par-dessus le chien paresseux.

11

L’apprentissage automatique permet aux ordinateurs d’apprendre.

12

Le traitement du langage naturel aide les machines à comprendre.

13

L’apprentissage profond a révolutionné l’intelligence artificielle.

14

La technologie de synthèse vocale a considérablement progressé.

15

Le clonage vocal neuronal peut reproduire les styles de parole.

16

La normalisation du texte est importante pour la prononciation.

17

Les assistants vocaux nous aident à interagir avec la technologie.

18

Les systèmes TTS modernes utilisent l’apprentissage profond.

19

L’interaction homme machine est devenue plus intuitive.

Speaker 9

0

Bonjour le monde.

1

Comment allez-vous aujourd’hui?

2

Le ciel est bleu.

3

J’aime l’apprentissage automatique.

4

Python est incroyable.

5

Bonjour à tous.

6

L’intelligence artificielle grandit.

7

La synthèse vocale est fascinante.

8

Les réseaux neuronaux sont puissants.

9

Le texte en voix convertit le texte en audio.

10

Le rapide renard brun saute par-dessus le chien paresseux.

11

L’apprentissage automatique permet aux ordinateurs d’apprendre.

12

Le traitement du langage naturel aide les machines à comprendre.

13

L’apprentissage profond a révolutionné l’intelligence artificielle.

14

La technologie de synthèse vocale a considérablement progressé.

15

Le clonage vocal neuronal peut reproduire les styles de parole.

16

La normalisation du texte est importante pour la prononciation.

17

Les assistants vocaux nous aident à interagir avec la technologie.

18

Les systèmes TTS modernes utilisent l’apprentissage profond.

19

L’interaction homme machine est devenue plus intuitive.

Georgian

This section lists text to speech models for Georgian.

vits-piper-ka_GE-natia-medium

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/ka/ka_GE/natia/medium

| Number of speakers | Sample rate (Hz) |
|---|---|
| 1 | 22050 |
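
As a quick sanity check, the sample rate listed above can be compared with a file produced by one of the examples on this page. The sketch below uses only Python's standard `wave` module and assumes the output is an uncompressed PCM WAV file; the helper name `wav_info` is illustrative and not part of sherpa-onnx.

```python
import wave


def wav_info(path: str) -> tuple[int, int, float]:
    """Return (sample_rate, num_frames, duration_in_seconds) of a PCM WAV file."""
    with wave.open(path, "rb") as f:
        sample_rate = f.getframerate()
        num_frames = f.getnframes()
    # Duration follows directly from the frame count and the sample rate.
    return sample_rate, num_frames, num_frames / sample_rate
```

For a file generated with this model, the first element of `wav_info("test.wav")` should match the sample rate in the table.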

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-ka_GE-natia-medium.tar.bz2
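
If you prefer to script the download, the following is a minimal sketch using only the Python standard library. The helper names are illustrative, and the returned directory name assumes the archive unpacks into a folder named after itself, as the paths used in the examples on this page suggest.

```python
import tarfile
import urllib.request
from pathlib import Path

MODEL_URL = (
    "https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/"
    "vits-piper-ka_GE-natia-medium.tar.bz2"
)


def extract_archive(archive: Path, dest: Path) -> None:
    """Extract a .tar.bz2 archive into dest, creating dest if needed."""
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive, "r:bz2") as tar:
        if hasattr(tarfile, "data_filter"):
            # Newer Python versions can reject unsafe members
            # (absolute paths, links that escape dest).
            tar.extractall(dest, filter="data")
        else:
            tar.extractall(dest)


def download_and_extract(url: str = MODEL_URL, dest: Path = Path(".")) -> Path:
    """Download the archive at url into dest, extract it, delete the archive."""
    dest.mkdir(parents=True, exist_ok=True)
    archive = dest / url.rsplit("/", 1)[-1]
    urllib.request.urlretrieve(url, archive)
    extract_archive(archive, dest)
    archive.unlink()
    # Assumption: the archive unpacks into a directory named after itself.
    return dest / archive.name.removesuffix(".tar.bz2")
```

Calling `download_and_extract()` with no arguments fetches the model into the current directory, which is where the examples on this page expect to find it.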

Android APK

The following table lists the Android TTS Engine APKs that bundle this model, built for sherpa-onnx v1.13.2.

| ABI | URL | China mirror |
|---|---|---|
| arm64-v8a | Download | Download |
| armeabi-v7a | Download | Download |
| x86_64 | Download | Download |
| x86 | Download | Download |

If you don’t know what ABI is, you probably need to select arm64-v8a.
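
If you can run Python on the device itself (for example in a terminal app), the machine string reported by the operating system usually indicates the ABI. The mapping below is a rough sketch based on common values of `platform.machine()`; the helper name is hypothetical, and unknown values fall back to arm64-v8a in line with the advice above.

```python
import platform

# Common values of platform.machine() mapped to Android ABI names.
_MACHINE_TO_ABI = {
    "aarch64": "arm64-v8a",
    "arm64": "arm64-v8a",
    "armv7l": "armeabi-v7a",
    "armv8l": "armeabi-v7a",  # 32-bit userspace on a 64-bit ARM CPU
    "x86_64": "x86_64",
    "amd64": "x86_64",
    "i686": "x86",
    "i386": "x86",
}


def abi_for_machine(machine: str) -> str:
    """Map a machine string to an ABI name; default to arm64-v8a if unknown."""
    return _MACHINE_TO_ABI.get(machine.lower(), "arm64-v8a")


def current_abi() -> str:
    """Best-effort ABI guess for the machine this script runs on."""
    return abi_for_machine(platform.machine())
```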

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-ka_GE-natia-medium with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-ka_GE-natia-medium/ka_GE-natia-medium.onnx";
  config.model.vits.tokens = "vits-piper-ka_GE-natia-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-ka_GE-natia-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "ღვინო თბილისში, საქართველო სამტრედში";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc   -I /tmp/sherpa-onnx/shared/include   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-c-api   -lonnxruntime   -o /tmp/test-piper   /tmp/test-piper.c

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.
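
The same setup can be scripted so that the library path is set only for the example process rather than for the whole shell session. This is a sketch under the paths used above; the helper names are illustrative.

```python
import os
import platform
import subprocess


def env_with_library_path(lib_dir: str, system: str = platform.system()) -> dict:
    """Return a copy of the environment with lib_dir prepended to the
    dynamic-linker search path (DYLD_LIBRARY_PATH on macOS,
    LD_LIBRARY_PATH otherwise)."""
    var = "DYLD_LIBRARY_PATH" if system == "Darwin" else "LD_LIBRARY_PATH"
    env = dict(os.environ)
    old = env.get(var)
    env[var] = lib_dir if not old else lib_dir + os.pathsep + old
    return env


def run_example(binary: str = "/tmp/test-piper",
                lib_dir: str = "/tmp/sherpa-onnx/shared/lib") -> int:
    """Run the compiled example from /tmp and return its exit code."""
    result = subprocess.run([binary], cwd="/tmp",
                            env=env_with_library_path(lib_dir))
    return result.returncode
```

Running from `/tmp` matters because the example refers to the model directory by a relative path.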

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-ka_GE-natia-medium.tar.bz2

You can use the following code to play with vits-piper-ka_GE-natia-medium

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-ka_GE-natia-medium/ka_GE-natia-medium.onnx",
            data_dir="vits-piper-ka_GE-natia-medium/espeak-ng-data",
            tokens="vits-piper-ka_GE-natia-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="ღვინო თბილისში, საქართველო სამტრედში",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
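
The Node.js example on this page reports a real-time factor (RTF), that is, elapsed synthesis time divided by the duration of the generated audio. A comparable measurement in Python could look like the sketch below; the helper names are illustrative and not part of sherpa-onnx.

```python
import time
from typing import Callable, Sequence


def real_time_factor(elapsed_seconds: float, num_samples: int,
                     sample_rate: int) -> float:
    """RTF = elapsed time / audio duration; values below 1 are faster than real time."""
    duration = num_samples / sample_rate
    return elapsed_seconds / duration


def timed_rtf(generate: Callable[[], Sequence[float]], sample_rate: int) -> float:
    """Time a zero-argument callable that returns audio samples and compute its RTF."""
    start = time.perf_counter()
    samples = generate()
    elapsed = time.perf_counter() - start
    return real_time_factor(elapsed, len(samples), sample_rate)
```

With the `tts` object from the example above, this could be called as `timed_rtf(lambda: tts.generate(text="...", sid=0, speed=1.0).samples, tts.sample_rate)`.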

C++ API

You can use the following code to play with vits-piper-ka_GE-natia-medium with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-ka_GE-natia-medium/ka_GE-natia-medium.onnx";
  config.model.vits.tokens = "vits-piper-ka_GE-natia-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-ka_GE-natia-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "ღვინო თბილისში, საქართველო სამტრედში";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++   -std=c++17   -I /tmp/sherpa-onnx/shared/include   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-cxx-api   -lsherpa-onnx-c-api   -lonnxruntime   -o /tmp/test-piper   /tmp/test-piper.cc

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with vits-piper-ka_GE-natia-medium with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-ka_GE-natia-medium/ka_GE-natia-medium.onnx".into()),
                tokens: Some("vits-piper-ka_GE-natia-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-ka_GE-natia-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "ღვინო თბილისში, საქართველო სამტრედში";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-ka_GE-natia-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-ka_GE-natia-medium/ka_GE-natia-medium.onnx',
        tokens: 'vits-piper-ka_GE-natia-medium/tokens.txt',
        dataDir: 'vits-piper-ka_GE-natia-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'ღვინო თბილისში, საქართველო სამტრედში';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to play with vits-piper-ka_GE-natia-medium with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-ka_GE-natia-medium/ka_GE-natia-medium.onnx',
    tokens: 'vits-piper-ka_GE-natia-medium/tokens.txt',
    dataDir: 'vits-piper-ka_GE-natia-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'ღვინო თბილისში, საქართველო სამტრედში', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-ka_GE-natia-medium with Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-ka_GE-natia-medium/ka_GE-natia-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-ka_GE-natia-medium/tokens.txt",
    dataDir: "vits-piper-ka_GE-natia-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "ღვინო თბილისში, საქართველო სამტრედში"
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-ka_GE-natia-medium with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-ka_GE-natia-medium/ka_GE-natia-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-ka_GE-natia-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-ka_GE-natia-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "ღვინო თბილისში, საქართველო სამტრედში";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-ka_GE-natia-medium with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-ka_GE-natia-medium/ka_GE-natia-medium.onnx",
        tokens = "vits-piper-ka_GE-natia-medium/tokens.txt",
        dataDir = "vits-piper-ka_GE-natia-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "ღვინო თბილისში, საქართველო სამტრედში",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with vits-piper-ka_GE-natia-medium with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-ka_GE-natia-medium/ka_GE-natia-medium.onnx");
    vits.setTokens("vits-piper-ka_GE-natia-medium/tokens.txt");
    vits.setDataDir("vits-piper-ka_GE-natia-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "ღვინო თბილისში, საქართველო სამტრედში";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-ka_GE-natia-medium with Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-ka_GE-natia-medium/ka_GE-natia-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-ka_GE-natia-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-ka_GE-natia-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('ღვინო თბილისში, საქართველო სამტრედში', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-ka_GE-natia-medium with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-ka_GE-natia-medium/ka_GE-natia-medium.onnx",
				Tokens:  "vits-piper-ka_GE-natia-medium/tokens.txt",
				DataDir: "vits-piper-ka_GE-natia-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "ღვინო თბილისში, საქართველო სამტრედში"

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

ღვინო თბილისში, საქართველო სამტრედში

sample audios for different speakers are listed below:

Speaker 0

German

This section lists text to speech models for German.

vits-piper-de_DE-dii-high

Info about this model

This model is converted from https://huggingface.co/OpenVoiceOS/pipertts_de-DE_dii

Number of speakers    Sample rate
1                     22050

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-de_DE-dii-high.tar.bz2
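
The download and extraction can also be scripted. The following is a minimal sketch using only the Python standard library; the helper names archive_url and download_and_extract are hypothetical, and it assumes that the archive unpacks into a directory named after the model, as the paths in the examples below suggest.

```python
import tarfile
import urllib.request
from pathlib import Path

# Release location taken from the download address above.
BASE_URL = "https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models"


def archive_url(model_name: str) -> str:
    """Return the URL of the .tar.bz2 archive for a model (hypothetical helper)."""
    return f"{BASE_URL}/{model_name}.tar.bz2"


def download_and_extract(model_name: str, dest: str = ".") -> Path:
    """Download the archive, unpack it into dest, and delete the archive."""
    dest_dir = Path(dest)
    dest_dir.mkdir(parents=True, exist_ok=True)
    archive = dest_dir / f"{model_name}.tar.bz2"
    urllib.request.urlretrieve(archive_url(model_name), str(archive))
    with tarfile.open(archive, "r:bz2") as tar:
        tar.extractall(dest_dir)
    archive.unlink()
    # Assumption: the archive contains a top-level directory named model_name.
    return dest_dir / model_name


# Usage (needs network access):
# download_and_extract("vits-piper-de_DE-dii-high")
```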

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI            URL         China mirror
arm64-v8a      Download    Download
armeabi-v7a    Download    Download
x86_64         Download    Download
x86            Download    Download

If you don’t know what ABI is, you probably need to select arm64-v8a.
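
If you have a device connected over adb, its supported ABIs can be read with `adb shell getprop ro.product.cpu.abilist`, which prints a comma-separated list in order of preference. The following sketch shows how such a list maps onto the four APKs offered above; pick_abi is a hypothetical helper, not part of sherpa-onnx.

```python
# ABIs for which an APK is offered in the table above.
OFFERED_ABIS = ("arm64-v8a", "armeabi-v7a", "x86_64", "x86")


def pick_abi(device_abilist: str):
    """Return the device's most preferred ABI that has an APK, else None.

    device_abilist is a comma-separated string in the device's order of
    preference, e.g. "arm64-v8a,armeabi-v7a,armeabi".
    """
    for abi in (a.strip() for a in device_abilist.split(",")):
        if abi in OFFERED_ABIS:
            return abi
    return None


print(pick_abi("arm64-v8a,armeabi-v7a,armeabi"))  # prints arm64-v8a
```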

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-de_DE-dii-high with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-de_DE-dii-high/de_DE-dii-high.onnx";
  config.model.vits.tokens = "vits-piper-de_DE-dii-high/tokens.txt";
  config.model.vits.data_dir = "vits-piper-de_DE-dii-high/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Alles hat ein Ende, nur die Wurst hat zwei.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx and soundfile (used below to write the audio file) via

pip install sherpa-onnx soundfile

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-de_DE-dii-high.tar.bz2

You can use the following code to play with vits-piper-de_DE-dii-high with the Python API.

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-de_DE-dii-high/de_DE-dii-high.onnx",
            data_dir="vits-piper-de_DE-dii-high/espeak-ng-data",
            tokens="vits-piper-de_DE-dii-high/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Alles hat ein Ende, nur die Wurst hat zwei.",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
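
If soundfile is not available, a WAV file can also be written with the Python standard library alone. The following is a minimal sketch, not part of sherpa-onnx: write_wav_pcm16 is a hypothetical helper, and it assumes that the generated samples are mono floats in the range [-1, 1].

```python
import struct
import wave


def write_wav_pcm16(filename, samples, sample_rate):
    """Write mono float samples as a 16-bit PCM WAV file.

    Assumption: samples are floats in [-1, 1]; values outside are clipped.
    """
    frames = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))  # clip to the valid range
        frames += struct.pack("<h", int(round(s * 32767)))
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)  # mono
        f.setsampwidth(2)  # 2 bytes = 16 bits per sample
        f.setframerate(sample_rate)
        f.writeframes(bytes(frames))


# Usage with the objects from the example above:
# write_wav_pcm16("test.wav", audio.samples, audio.sample_rate)
```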

C++ API

You can use the following code to play with vits-piper-de_DE-dii-high with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-de_DE-dii-high/de_DE-dii-high.onnx";
  config.model.vits.tokens = "vits-piper-de_DE-dii-high/tokens.txt";
  config.model.vits.data_dir = "vits-piper-de_DE-dii-high/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Alles hat ein Ende, nur die Wurst hat zwei.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with vits-piper-de_DE-dii-high with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-de_DE-dii-high/de_DE-dii-high.onnx".into()),
                tokens: Some("vits-piper-de_DE-dii-high/tokens.txt".into()),
                data_dir: Some("vits-piper-de_DE-dii-high/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Alles hat ein Ende, nur die Wurst hat zwei.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-de_DE-dii-high with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-de_DE-dii-high/de_DE-dii-high.onnx',
        tokens: 'vits-piper-de_DE-dii-high/tokens.txt',
        dataDir: 'vits-piper-de_DE-dii-high/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Alles hat ein Ende, nur die Wurst hat zwei.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.
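
The real-time factor (RTF) printed by the example above is the time spent generating divided by the duration of the generated audio, so a value below 1 means that synthesis runs faster than playback. The same arithmetic as a small Python sketch (the function name is hypothetical):

```python
def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """Return elapsed time divided by the duration of the generated audio."""
    duration = num_samples / sample_rate  # audio length in seconds
    return elapsed_seconds / duration


# 44100 samples at 22050 Hz are 2 seconds of audio; generating them in
# 0.5 seconds gives an RTF of 0.25.
print(real_time_factor(0.5, 44100, 22050))  # prints 0.25
```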

Dart API

You can use the following code to play with vits-piper-de_DE-dii-high with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-de_DE-dii-high/de_DE-dii-high.onnx',
    tokens: 'vits-piper-de_DE-dii-high/tokens.txt',
    dataDir: 'vits-piper-de_DE-dii-high/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Alles hat ein Ende, nur die Wurst hat zwei.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-de_DE-dii-high with Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-de_DE-dii-high/de_DE-dii-high.onnx",
    lexicon: "",
    tokens: "vits-piper-de_DE-dii-high/tokens.txt",
    dataDir: "vits-piper-de_DE-dii-high/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Alles hat ein Ende, nur die Wurst hat zwei."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-de_DE-dii-high with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-de_DE-dii-high/de_DE-dii-high.onnx";
config.Model.Vits.Tokens = "vits-piper-de_DE-dii-high/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-de_DE-dii-high/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Alles hat ein Ende, nur die Wurst hat zwei.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-de_DE-dii-high with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-de_DE-dii-high/de_DE-dii-high.onnx",
        tokens = "vits-piper-de_DE-dii-high/tokens.txt",
        dataDir = "vits-piper-de_DE-dii-high/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Alles hat ein Ende, nur die Wurst hat zwei.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with vits-piper-de_DE-dii-high with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-de_DE-dii-high/de_DE-dii-high.onnx");
    vits.setTokens("vits-piper-de_DE-dii-high/tokens.txt");
    vits.setDataDir("vits-piper-de_DE-dii-high/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Alles hat ein Ende, nur die Wurst hat zwei.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-de_DE-dii-high with Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-de_DE-dii-high/de_DE-dii-high.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-de_DE-dii-high/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-de_DE-dii-high/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Alles hat ein Ende, nur die Wurst hat zwei.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-de_DE-dii-high with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-de_DE-dii-high/de_DE-dii-high.onnx",
				Tokens:  "vits-piper-de_DE-dii-high/tokens.txt",
				DataDir: "vits-piper-de_DE-dii-high/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Alles hat ein Ende, nur die Wurst hat zwei."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Alles hat ein Ende, nur die Wurst hat zwei.

sample audios for different speakers are listed below:

Speaker 0

vits-piper-de_DE-eva_k-x_low

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/de/de_DE/eva_k/x_low

Number of speakers    Sample rate
1                     16000

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-de_DE-eva_k-x_low.tar.bz2
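
Every example below passes the same three paths from the extracted directory: the .onnx model, tokens.txt, and espeak-ng-data. The following sketch checks that they exist before a TTS object is created. Both helpers are hypothetical, and the rule that the model file is the directory name without the vits-piper- prefix is inferred from the paths used in this document.

```python
from pathlib import Path


def piper_model_paths(model_dir: str) -> dict:
    """Build the three paths that the examples pass to the VITS config."""
    root = Path(model_dir)
    # Assumption: "vits-piper-de_DE-eva_k-x_low" -> "de_DE-eva_k-x_low.onnx"
    voice = root.name.removeprefix("vits-piper-")  # requires Python 3.9+
    return {
        "model": root / f"{voice}.onnx",
        "tokens": root / "tokens.txt",
        "data_dir": root / "espeak-ng-data",
    }


def missing_paths(model_dir: str) -> list:
    """Return the keys of the expected entries that do not exist on disk."""
    paths = piper_model_paths(model_dir)
    return [key for key, path in paths.items() if not path.exists()]


# Usage: an empty list means all three entries are in place.
# print(missing_paths("vits-piper-de_DE-eva_k-x_low"))
```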

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI            URL         China mirror
arm64-v8a      Download    Download
armeabi-v7a    Download    Download
x86_64         Download    Download
x86            Download    Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-de_DE-eva_k-x_low with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-de_DE-eva_k-x_low/de_DE-eva_k-x_low.onnx";
  config.model.vits.tokens = "vits-piper-de_DE-eva_k-x_low/tokens.txt";
  config.model.vits.data_dir = "vits-piper-de_DE-eva_k-x_low/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Alles hat ein Ende, nur die Wurst hat zwei.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx and soundfile (used below to write the audio file) via

pip install sherpa-onnx soundfile

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-de_DE-eva_k-x_low.tar.bz2

You can use the following code to play with vits-piper-de_DE-eva_k-x_low with the Python API.

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-de_DE-eva_k-x_low/de_DE-eva_k-x_low.onnx",
            data_dir="vits-piper-de_DE-eva_k-x_low/espeak-ng-data",
            tokens="vits-piper-de_DE-eva_k-x_low/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Alles hat ein Ende, nur die Wurst hat zwei.",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)

C++ API

You can use the following code to play with vits-piper-de_DE-eva_k-x_low with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-de_DE-eva_k-x_low/de_DE-eva_k-x_low.onnx";
  config.model.vits.tokens = "vits-piper-de_DE-eva_k-x_low/tokens.txt";
  config.model.vits.data_dir = "vits-piper-de_DE-eva_k-x_low/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Alles hat ein Ende, nur die Wurst hat zwei.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with vits-piper-de_DE-eva_k-x_low with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-de_DE-eva_k-x_low/de_DE-eva_k-x_low.onnx".into()),
                tokens: Some("vits-piper-de_DE-eva_k-x_low/tokens.txt".into()),
                data_dir: Some("vits-piper-de_DE-eva_k-x_low/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Alles hat ein Ende, nur die Wurst hat zwei.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-de_DE-eva_k-x_low with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-de_DE-eva_k-x_low/de_DE-eva_k-x_low.onnx',
        tokens: 'vits-piper-de_DE-eva_k-x_low/tokens.txt',
        dataDir: 'vits-piper-de_DE-eva_k-x_low/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Alles hat ein Ende, nur die Wurst hat zwei.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.
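The real-time factor (RTF) printed by the example above is the processing time divided by the duration of the generated audio. The following is a minimal Python sketch of the same arithmetic. The function name and the numbers in the demo are made up for illustration and are not part of sherpa-onnx.

```python
def real_time_factor(elapsed_seconds: float, num_samples: int, sample_rate: int) -> float:
    """Return elapsed time divided by audio duration (smaller means faster)."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # audio length in seconds
    return elapsed_seconds / duration


if __name__ == "__main__":
    # Hypothetical numbers: 0.5 s of compute for 32000 samples at 16 kHz
    rtf = real_time_factor(0.5, 32000, 16000)
    print(f"RTF = {rtf:.3f}")
```

An RTF below 1 means the audio is generated faster than it takes to play it back.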

Dart API

You can use the following code to play with vits-piper-de_DE-eva_k-x_low with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-de_DE-eva_k-x_low/de_DE-eva_k-x_low.onnx',
    tokens: 'vits-piper-de_DE-eva_k-x_low/tokens.txt',
    dataDir: 'vits-piper-de_DE-eva_k-x_low/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Alles hat ein Ende, nur die Wurst hat zwei.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-de_DE-eva_k-x_low with Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-de_DE-eva_k-x_low/de_DE-eva_k-x_low.onnx",
    lexicon: "",
    tokens: "vits-piper-de_DE-eva_k-x_low/tokens.txt",
    dataDir: "vits-piper-de_DE-eva_k-x_low/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Alles hat ein Ende, nur die Wurst hat zwei."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-de_DE-eva_k-x_low with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-de_DE-eva_k-x_low/de_DE-eva_k-x_low.onnx";
config.Model.Vits.Tokens = "vits-piper-de_DE-eva_k-x_low/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-de_DE-eva_k-x_low/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Alles hat ein Ende, nur die Wurst hat zwei.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-de_DE-eva_k-x_low with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-de_DE-eva_k-x_low/de_DE-eva_k-x_low.onnx",
        tokens = "vits-piper-de_DE-eva_k-x_low/tokens.txt",
        dataDir = "vits-piper-de_DE-eva_k-x_low/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Alles hat ein Ende, nur die Wurst hat zwei.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.
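The callback above always returns 1, so generation never stops early. The following Python sketch shows one way to use the same convention (1 means continue, 0 means stop) to stop after a sample budget is reached. The class SampleBudgetCallback is a hypothetical helper written for this illustration. It is not part of sherpa-onnx, and whether a given binding accepts a callable of this shape is an assumption you should check against its API documentation.

```python
class SampleBudgetCallback:
    """Collect generated chunks and request a stop once a budget is reached.

    Hypothetical helper: it only mirrors the 1 = continue / 0 = stop
    convention used by the callbacks in the examples above.
    """

    def __init__(self, max_samples: int):
        self.max_samples = max_samples
        self.chunks = []       # every chunk received so far
        self.num_samples = 0   # total number of samples received

    def __call__(self, samples) -> int:
        self.chunks.append(list(samples))
        self.num_samples += len(samples)
        # 1 asks the generator to continue, 0 asks it to stop
        return 1 if self.num_samples < self.max_samples else 0
```

Keeping the state in an object avoids global variables when the callback also has to hand the chunks to a player or a file writer.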

Java API

You can use the following code to play with vits-piper-de_DE-eva_k-x_low with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-de_DE-eva_k-x_low/de_DE-eva_k-x_low.onnx");
    vits.setTokens("vits-piper-de_DE-eva_k-x_low/tokens.txt");
    vits.setDataDir("vits-piper-de_DE-eva_k-x_low/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Alles hat ein Ende, nur die Wurst hat zwei.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-de_DE-eva_k-x_low with Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-de_DE-eva_k-x_low/de_DE-eva_k-x_low.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-de_DE-eva_k-x_low/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-de_DE-eva_k-x_low/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Alles hat ein Ende, nur die Wurst hat zwei.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-de_DE-eva_k-x_low with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-de_DE-eva_k-x_low/de_DE-eva_k-x_low.onnx",
				Tokens:  "vits-piper-de_DE-eva_k-x_low/tokens.txt",
				DataDir: "vits-piper-de_DE-eva_k-x_low/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Alles hat ein Ende, nur die Wurst hat zwei."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.
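Several examples above set MaxNumSentences to 1. As far as we understand, this option limits how many sentences are processed in one batch, and a non-positive value means that all sentences go into a single batch. The following Python sketch only illustrates that batching idea. The splitter is deliberately naive, both function names are hypothetical, and the real sentence splitting inside sherpa-onnx may differ.

```python
import re


def split_sentences(text: str) -> list:
    """Naive splitter for illustration: break after ., ! or ? followed by spaces."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def batch_sentences(sentences: list, max_num_sentences: int) -> list:
    """Group sentences into batches of at most max_num_sentences items.

    Assumption: a non-positive value puts everything into one batch.
    """
    if max_num_sentences <= 0:
        return [list(sentences)] if sentences else []
    return [
        sentences[i:i + max_num_sentences]
        for i in range(0, len(sentences), max_num_sentences)
    ]


if __name__ == "__main__":
    text = "Alles hat ein Ende. Nur die Wurst hat zwei! Oder?"
    for batch in batch_sentences(split_sentences(text), 2):
        print(batch)
```

Smaller batches keep memory usage low for long input text and make a progress callback fire more often.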

Samples

For the following text:

Alles hat ein Ende, nur die Wurst hat zwei.

audio samples for each speaker are listed below:

Speaker 0

vits-piper-de_DE-glados-high

Info about this model

This model is converted from https://huggingface.co/systemofapwne/piper-de-glados

Number of speakers    Sample rate
1                     22050

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-de_DE-glados-high.tar.bz2
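If you prefer to script the download, the following Python sketch uses only the standard library. It assumes the archive unpacks into a directory named after the model that contains the .onnx file, tokens.txt and espeak-ng-data, as the paths in the examples on this page suggest. The helper names archive_url, model_files and download_and_extract are hypothetical.

```python
import tarfile
import urllib.request
from pathlib import Path

BASE_URL = "https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models"


def archive_url(model_name: str) -> str:
    """Build the download address for a model archive."""
    return f"{BASE_URL}/{model_name}.tar.bz2"


def model_files(model_name: str, root: str = ".") -> dict:
    """Return the paths used by the examples, assuming the vits-piper layout."""
    prefix = "vits-piper-"
    if not model_name.startswith(prefix):
        raise ValueError(f"expected a name starting with {prefix!r}")
    voice = model_name[len(prefix):]  # e.g. de_DE-glados-high
    model_dir = Path(root) / model_name
    return {
        "model": model_dir / f"{voice}.onnx",
        "tokens": model_dir / "tokens.txt",
        "data_dir": model_dir / "espeak-ng-data",
    }


def download_and_extract(model_name: str, root: str = ".") -> Path:
    """Download the archive (needs network access) and unpack it under root."""
    archive = Path(root) / f"{model_name}.tar.bz2"
    urllib.request.urlretrieve(archive_url(model_name), archive)
    with tarfile.open(archive, "r:bz2") as tar:
        tar.extractall(root)
    archive.unlink()  # remove the archive after extraction
    return Path(root) / model_name
```

For example, download_and_extract("vits-piper-de_DE-glados-high") should leave the model in the current directory, and model_files returns the three paths to pass to the config.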

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI            URL         China mirror
arm64-v8a      Download    Download
armeabi-v7a    Download    Download
x86_64         Download    Download
x86            Download    Download

If you don’t know what ABI is, you probably need to select arm64-v8a.
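If you can read the machine name of the device, for example from the output of uname -m in a shell on the device, the following Python sketch maps it to one of the ABI names in the table. The mapping covers only common machine names, the function name is hypothetical, and unknown names fall back to arm64-v8a in line with the advice above.

```python
# Common machine names mapped to Android ABI names
_MACHINE_TO_ABI = {
    "aarch64": "arm64-v8a",
    "arm64": "arm64-v8a",
    "armv7l": "armeabi-v7a",
    "x86_64": "x86_64",
    "amd64": "x86_64",
    "i686": "x86",
    "i386": "x86",
}


def abi_for_machine(machine: str, default: str = "arm64-v8a") -> str:
    """Return the Android ABI for a machine name, or the default if unknown."""
    return _MACHINE_TO_ABI.get(machine.strip().lower(), default)


if __name__ == "__main__":
    print(abi_for_machine("aarch64"))
```

On a device connected via adb, the system property ro.product.cpu.abi reports the ABI directly, which is more reliable than guessing from the machine name.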

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-de_DE-glados-high with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-de_DE-glados-high/de_DE-glados-high.onnx";
  config.model.vits.tokens = "vits-piper-de_DE-glados-high/tokens.txt";
  config.model.vits.data_dir = "vits-piper-de_DE-glados-high/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

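  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
  if (!tts) {
    // Creation can fail, e.g., if one of the model paths above is wrong
    fprintf(stderr, "Failed to create the TTS engine. Please check your config\n");
    return -1;
  }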

  const char *text = "Alles hat ein Ende, nur die Wurst hat zwei.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc   -I /tmp/sherpa-onnx/shared/include   -o /tmp/test-piper   /tmp/test-piper.c   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.
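If you start the compiled example from a script, you can set the library search path only for that one process instead of exporting it in your shell. The following Python sketch shows the idea. The function names library_env and run_example are hypothetical, and the default paths are the ones used on this page.

```python
import os
import subprocess
import sys


def library_env(lib_dir: str, platform_name: str = sys.platform, base_env=None) -> dict:
    """Return a copy of the environment with lib_dir prepended to the search path."""
    env = dict(os.environ if base_env is None else base_env)
    # macOS uses DYLD_LIBRARY_PATH, Linux uses LD_LIBRARY_PATH
    var = "DYLD_LIBRARY_PATH" if platform_name == "darwin" else "LD_LIBRARY_PATH"
    existing = env.get(var)
    env[var] = f"{lib_dir}:{existing}" if existing else lib_dir
    return env


def run_example(binary: str = "/tmp/test-piper",
                lib_dir: str = "/tmp/sherpa-onnx/shared/lib",
                cwd: str = "/tmp") -> subprocess.CompletedProcess:
    """Run the compiled example in cwd, where the model directory is expected."""
    return subprocess.run([binary], cwd=cwd, env=library_env(lib_dir), check=True)
```

Calling run_example() raises subprocess.CalledProcessError if the program exits with a non-zero status.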

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-de_DE-glados-high.tar.bz2

You can use the following code to play with vits-piper-de_DE-glados-high with Python API.

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-de_DE-glados-high/de_DE-glados-high.onnx",
            data_dir="vits-piper-de_DE-glados-high/espeak-ng-data",
            tokens="vits-piper-de_DE-glados-high/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Alles hat ein Ende, nur die Wurst hat zwei.",
                     sid=0,
                     speed=1.0)

# Save as WAV, like the examples for the other languages
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
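If you do not want to depend on soundfile, the following sketch writes the generated samples with Python's standard wave module. It assumes the samples are mono floating-point values in the range [-1, 1]. The function name write_wav_pcm16 is hypothetical.

```python
import struct
import wave


def write_wav_pcm16(path: str, samples, sample_rate: int) -> None:
    """Write mono float samples in [-1, 1] as a 16-bit PCM WAV file."""
    frames = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))  # clip out-of-range values
        frames += struct.pack("<h", int(round(s * 32767)))
    with wave.open(path, "wb") as f:
        f.setnchannels(1)   # mono
        f.setsampwidth(2)   # 2 bytes per sample, i.e., 16-bit
        f.setframerate(sample_rate)
        f.writeframes(bytes(frames))
```

With the audio object from the example above, the call would look like write_wav_pcm16("test.wav", audio.samples, audio.sample_rate).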

C++ API

You can use the following code to play with vits-piper-de_DE-glados-high with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-de_DE-glados-high/de_DE-glados-high.onnx";
  config.model.vits.tokens = "vits-piper-de_DE-glados-high/tokens.txt";
  config.model.vits.data_dir = "vits-piper-de_DE-glados-high/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Alles hat ein Ende, nur die Wurst hat zwei.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++   -std=c++17   -I /tmp/sherpa-onnx/shared/include   -o /tmp/test-piper   /tmp/test-piper.cc   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-cxx-api   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with vits-piper-de_DE-glados-high with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-de_DE-glados-high/de_DE-glados-high.onnx".into()),
                tokens: Some("vits-piper-de_DE-glados-high/tokens.txt".into()),
                data_dir: Some("vits-piper-de_DE-glados-high/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Alles hat ein Ende, nur die Wurst hat zwei.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-de_DE-glados-high with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-de_DE-glados-high/de_DE-glados-high.onnx',
        tokens: 'vits-piper-de_DE-glados-high/tokens.txt',
        dataDir: 'vits-piper-de_DE-glados-high/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Alles hat ein Ende, nur die Wurst hat zwei.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to play with vits-piper-de_DE-glados-high with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-de_DE-glados-high/de_DE-glados-high.onnx',
    tokens: 'vits-piper-de_DE-glados-high/tokens.txt',
    dataDir: 'vits-piper-de_DE-glados-high/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Alles hat ein Ende, nur die Wurst hat zwei.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-de_DE-glados-high with Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-de_DE-glados-high/de_DE-glados-high.onnx",
    lexicon: "",
    tokens: "vits-piper-de_DE-glados-high/tokens.txt",
    dataDir: "vits-piper-de_DE-glados-high/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Alles hat ein Ende, nur die Wurst hat zwei."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-de_DE-glados-high with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-de_DE-glados-high/de_DE-glados-high.onnx";
config.Model.Vits.Tokens = "vits-piper-de_DE-glados-high/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-de_DE-glados-high/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Alles hat ein Ende, nur die Wurst hat zwei.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-de_DE-glados-high with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-de_DE-glados-high/de_DE-glados-high.onnx",
        tokens = "vits-piper-de_DE-glados-high/tokens.txt",
        dataDir = "vits-piper-de_DE-glados-high/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Alles hat ein Ende, nur die Wurst hat zwei.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with vits-piper-de_DE-glados-high with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-de_DE-glados-high/de_DE-glados-high.onnx");
    vits.setTokens("vits-piper-de_DE-glados-high/tokens.txt");
    vits.setDataDir("vits-piper-de_DE-glados-high/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Alles hat ein Ende, nur die Wurst hat zwei.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-de_DE-glados-high with Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-de_DE-glados-high/de_DE-glados-high.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-de_DE-glados-high/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-de_DE-glados-high/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Alles hat ein Ende, nur die Wurst hat zwei.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-de_DE-glados-high with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-de_DE-glados-high/de_DE-glados-high.onnx",
				Tokens:  "vits-piper-de_DE-glados-high/tokens.txt",
				DataDir: "vits-piper-de_DE-glados-high/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Alles hat ein Ende, nur die Wurst hat zwei."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Alles hat ein Ende, nur die Wurst hat zwei.

audio samples for each speaker are listed below:

Speaker 0

vits-piper-de_DE-glados-low

Info about this model

This model is converted from https://huggingface.co/systemofapwne/piper-de-glados

Number of speakers    Sample rate
1                     16000

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-de_DE-glados-low.tar.bz2

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI            URL         China mirror
arm64-v8a      Download    Download
armeabi-v7a    Download    Download
x86_64         Download    Download
x86            Download    Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-de_DE-glados-low with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-de_DE-glados-low/de_DE-glados-low.onnx";
  config.model.vits.tokens = "vits-piper-de_DE-glados-low/tokens.txt";
  config.model.vits.data_dir = "vits-piper-de_DE-glados-low/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

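  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
  if (!tts) {
    // Creation can fail, e.g., if one of the model paths above is wrong
    fprintf(stderr, "Failed to create the TTS engine. Please check your config\n");
    return -1;
  }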

  const char *text = "Alles hat ein Ende, nur die Wurst hat zwei.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc   -I /tmp/sherpa-onnx/shared/include   -o /tmp/test-piper   /tmp/test-piper.c   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-de_DE-glados-low.tar.bz2

You can use the following code to play with vits-piper-de_DE-glados-low with Python API.

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-de_DE-glados-low/de_DE-glados-low.onnx",
            data_dir="vits-piper-de_DE-glados-low/espeak-ng-data",
            tokens="vits-piper-de_DE-glados-low/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Alles hat ein Ende, nur die Wurst hat zwei.",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
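
If soundfile is not available, the generated samples can also be saved with Python's standard-library `wave` module. The sketch below assumes that `audio.samples` is a sequence of floats in the range [-1, 1] and that `audio.sample_rate` is an integer, as in the example above. The helper `write_wav_int16` is hypothetical and not part of sherpa-onnx.

```python
import struct
import wave


def write_wav_int16(filename, samples, sample_rate):
    """Write mono float samples in [-1, 1] as a 16-bit PCM WAV file."""
    frames = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))  # clip out-of-range values
        frames += struct.pack("<h", int(round(s * 32767)))  # little-endian int16
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)  # mono
        f.setsampwidth(2)  # 2 bytes per sample
        f.setframerate(sample_rate)
        f.writeframes(bytes(frames))
```

With the objects from the example above, the call would be `write_wav_int16("test.wav", audio.samples, audio.sample_rate)`.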

C++ API

You can use the following code to play with vits-piper-de_DE-glados-low with the C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-de_DE-glados-low/de_DE-glados-low.onnx";
  config.model.vits.tokens = "vits-piper-de_DE-glados-low/tokens.txt";
  config.model.vits.data_dir = "vits-piper-de_DE-glados-low/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Alles hat ein Ende, nur die Wurst hat zwei.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++   -std=c++17   -I /tmp/sherpa-onnx/shared/include   -o /tmp/test-piper   /tmp/test-piper.cc   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-cxx-api   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with vits-piper-de_DE-glados-low with the Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-de_DE-glados-low/de_DE-glados-low.onnx".into()),
                tokens: Some("vits-piper-de_DE-glados-low/tokens.txt".into()),
                data_dir: Some("vits-piper-de_DE-glados-low/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Alles hat ein Ende, nur die Wurst hat zwei.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-de_DE-glados-low with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-de_DE-glados-low/de_DE-glados-low.onnx',
        tokens: 'vits-piper-de_DE-glados-low/tokens.txt',
        dataDir: 'vits-piper-de_DE-glados-low/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Alles hat ein Ende, nur die Wurst hat zwei.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.
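
The real-time factor (RTF) printed by the example above is the processing time divided by the duration of the generated audio, so a value below 1 means that synthesis runs faster than playback. A minimal Python sketch of the same arithmetic follows. The helper names `real_time_factor` and `timed` are hypothetical and not part of sherpa-onnx.

```python
import time


def real_time_factor(elapsed_seconds: float, num_samples: int, sample_rate: int) -> float:
    """Return processing time divided by the duration of the generated audio."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # seconds of audio
    return elapsed_seconds / duration


def timed(fn, *args, **kwargs):
    """Call fn and return (result, elapsed_seconds) using a monotonic clock."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start
```

For example, generating 44100 samples at 22050 Hz (2 seconds of audio) in 0.5 seconds gives an RTF of 0.25.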

Dart API

You can use the following code to play with vits-piper-de_DE-glados-low with the Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-de_DE-glados-low/de_DE-glados-low.onnx',
    tokens: 'vits-piper-de_DE-glados-low/tokens.txt',
    dataDir: 'vits-piper-de_DE-glados-low/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Alles hat ein Ende, nur die Wurst hat zwei.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-de_DE-glados-low with the Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-de_DE-glados-low/de_DE-glados-low.onnx",
    lexicon: "",
    tokens: "vits-piper-de_DE-glados-low/tokens.txt",
    dataDir: "vits-piper-de_DE-glados-low/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Alles hat ein Ende, nur die Wurst hat zwei."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-de_DE-glados-low with the C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-de_DE-glados-low/de_DE-glados-low.onnx";
config.Model.Vits.Tokens = "vits-piper-de_DE-glados-low/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-de_DE-glados-low/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Alles hat ein Ende, nur die Wurst hat zwei.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-de_DE-glados-low with the Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  val config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-de_DE-glados-low/de_DE-glados-low.onnx",
        tokens = "vits-piper-de_DE-glados-low/tokens.txt",
        dataDir = "vits-piper-de_DE-glados-low/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Alles hat ein Ende, nur die Wurst hat zwei.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.
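
The callback contract used in the examples above (return 1 to continue, 0 to stop) can be tried out without loading a model. The following Python sketch is only an illustration: `make_limited_callback` is a hypothetical helper, and `drive` is a stand-in for the TTS engine that feeds chunks of samples to the callback. Neither is part of sherpa-onnx.

```python
def make_limited_callback(max_samples):
    """Return (callback, collected); the callback asks to stop after max_samples."""
    collected = []

    def callback(samples, progress=0.0):
        collected.extend(samples)
        # 1 means continue generating, 0 means stop generating
        return 1 if len(collected) < max_samples else 0

    return callback, collected


def drive(chunks, callback):
    """Stand-in for the engine: deliver chunks until the callback returns 0."""
    delivered = 0
    for chunk in chunks:
        delivered += 1
        if callback(chunk) == 0:
            break
    return delivered
```

A callback of this shape is useful when playback is cancelled by the user and the remaining audio is no longer needed.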

Java API

You can use the following code to play with vits-piper-de_DE-glados-low with the Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-de_DE-glados-low/de_DE-glados-low.onnx");
    vits.setTokens("vits-piper-de_DE-glados-low/tokens.txt");
    vits.setDataDir("vits-piper-de_DE-glados-low/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Alles hat ein Ende, nur die Wurst hat zwei.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-de_DE-glados-low with the Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-de_DE-glados-low/de_DE-glados-low.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-de_DE-glados-low/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-de_DE-glados-low/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Alles hat ein Ende, nur die Wurst hat zwei.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-de_DE-glados-low with the Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-de_DE-glados-low/de_DE-glados-low.onnx",
				Tokens:  "vits-piper-de_DE-glados-low/tokens.txt",
				DataDir: "vits-piper-de_DE-glados-low/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Alles hat ein Ende, nur die Wurst hat zwei."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Alles hat ein Ende, nur die Wurst hat zwei.

sample audio for each speaker is listed below:

Speaker 0

vits-piper-de_DE-glados-medium

Info about this model

This model is converted from https://huggingface.co/systemofapwne/piper-de-glados

| Number of speakers | Sample rate |
|--------------------|-------------|
| 1                  | 22050       |

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-de_DE-glados-medium.tar.bz2
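
The archive can also be fetched and unpacked from a script. Below is a minimal sketch that uses only the Python standard library. The helper names `download` and `extract` are hypothetical, and the sketch assumes the archive is a regular bzip2-compressed tar file, as the `.tar.bz2` suffix suggests.

```python
import tarfile
import urllib.request
from pathlib import Path

MODEL_URL = (
    "https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/"
    "vits-piper-de_DE-glados-medium.tar.bz2"
)


def download(url, archive_path):
    """Download url to archive_path unless the file already exists."""
    archive_path = Path(archive_path)
    if not archive_path.exists():
        urllib.request.urlretrieve(url, str(archive_path))
    return archive_path


def extract(archive_path, dest_dir="."):
    """Extract a .tar.bz2 archive and return its top-level entry names."""
    with tarfile.open(str(archive_path), "r:bz2") as tar:
        names = tar.getnames()
        # Use the safer "data" extraction filter where this Python supports it.
        kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
        tar.extractall(str(dest_dir), **kwargs)
    return sorted({n.split("/")[0] for n in names})
```

Calling `extract(download(MODEL_URL, "vits-piper-de_DE-glados-medium.tar.bz2"))` would leave the model directory in the current working directory.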

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

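| ABI         | URL      | China mirror |
|-------------|----------|--------------|
| arm64-v8a   | Download | Download     |
| armeabi-v7a | Download | Download     |
| x86_64      | Download | Download     |
| x86         | Download | Download     |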

If you don’t know what ABI is, you probably need to select arm64-v8a.
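
If you want to check the ABI rather than guess, the CPU architecture name reported by the operating system maps onto the four ABI names in the table. The following Python sketch shows that mapping. The function `android_abi` is a hypothetical helper, and the sketch assumes it is run on the device itself (for example, inside a terminal app), where `platform.machine()` reports the device architecture.

```python
import platform

# Common machine names and the Android ABI they correspond to.
_MACHINE_TO_ABI = {
    "aarch64": "arm64-v8a",
    "arm64": "arm64-v8a",
    "armv7l": "armeabi-v7a",
    "x86_64": "x86_64",
    "amd64": "x86_64",
    "i686": "x86",
    "i386": "x86",
}


def android_abi(machine=None):
    """Return the Android ABI name for a machine string, or None if unknown."""
    machine = (machine or platform.machine()).lower()
    return _MACHINE_TO_ABI.get(machine)
```

On a device connected over adb, `adb shell getprop ro.product.cpu.abi` reports the ABI name directly.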

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-de_DE-glados-medium with the C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-de_DE-glados-medium/de_DE-glados-medium.onnx";
  config.model.vits.tokens = "vits-piper-de_DE-glados-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-de_DE-glados-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

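  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
  if (!tts) {
    // Creation fails, for example, when a model file path is wrong
    fprintf(stderr, "Failed to create the TTS object. Please check your config\n");
    return 1;
  }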

  const char *text = "Alles hat ein Ende, nur die Wurst hat zwei.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc   -I /tmp/sherpa-onnx/shared/include   -o /tmp/test-piper   /tmp/test-piper.c   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-de_DE-glados-medium.tar.bz2

You can use the following code to play with vits-piper-de_DE-glados-medium

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-de_DE-glados-medium/de_DE-glados-medium.onnx",
            data_dir="vits-piper-de_DE-glados-medium/espeak-ng-data",
            tokens="vits-piper-de_DE-glados-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Alles hat ein Ende, nur die Wurst hat zwei.",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
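
Before building the config, it can help to confirm that the three paths exist, since a wrong path is the most likely reason for `config.validate()` to fail. The sketch below derives the paths from the model directory. It assumes the naming pattern seen in the examples on this page (a directory `vits-piper-<name>` containing `<name>.onnx`, `tokens.txt`, and `espeak-ng-data`). The helpers `piper_model_files` and `missing_files` are hypothetical and not part of sherpa-onnx.

```python
from pathlib import Path


def piper_model_files(model_dir):
    """Return the model, tokens, and data_dir paths for a piper model directory."""
    d = Path(model_dir)
    prefix = "vits-piper-"
    # Assumed convention: the .onnx file is named after the directory without the prefix.
    stem = d.name[len(prefix):] if d.name.startswith(prefix) else d.name
    return {
        "model": str(d / f"{stem}.onnx"),
        "tokens": str(d / "tokens.txt"),
        "data_dir": str(d / "espeak-ng-data"),
    }


def missing_files(paths):
    """Return the keys whose paths do not exist on disk."""
    return [key for key, p in paths.items() if not Path(p).exists()]
```

If `missing_files(piper_model_files("vits-piper-de_DE-glados-medium"))` returns a non-empty list, the model has probably not been extracted into the current directory.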

C++ API

You can use the following code to play with vits-piper-de_DE-glados-medium with the C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-de_DE-glados-medium/de_DE-glados-medium.onnx";
  config.model.vits.tokens = "vits-piper-de_DE-glados-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-de_DE-glados-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Alles hat ein Ende, nur die Wurst hat zwei.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++   -std=c++17   -I /tmp/sherpa-onnx/shared/include   -o /tmp/test-piper   /tmp/test-piper.cc   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-cxx-api   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with vits-piper-de_DE-glados-medium with the Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-de_DE-glados-medium/de_DE-glados-medium.onnx".into()),
                tokens: Some("vits-piper-de_DE-glados-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-de_DE-glados-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Alles hat ein Ende, nur die Wurst hat zwei.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-de_DE-glados-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-de_DE-glados-medium/de_DE-glados-medium.onnx',
        tokens: 'vits-piper-de_DE-glados-medium/tokens.txt',
        dataDir: 'vits-piper-de_DE-glados-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Alles hat ein Ende, nur die Wurst hat zwei.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to play with vits-piper-de_DE-glados-medium with the Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-de_DE-glados-medium/de_DE-glados-medium.onnx',
    tokens: 'vits-piper-de_DE-glados-medium/tokens.txt',
    dataDir: 'vits-piper-de_DE-glados-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Alles hat ein Ende, nur die Wurst hat zwei.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-de_DE-glados-medium with the Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-de_DE-glados-medium/de_DE-glados-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-de_DE-glados-medium/tokens.txt",
    dataDir: "vits-piper-de_DE-glados-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Alles hat ein Ende, nur die Wurst hat zwei."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-de_DE-glados-medium with the C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-de_DE-glados-medium/de_DE-glados-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-de_DE-glados-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-de_DE-glados-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Alles hat ein Ende, nur die Wurst hat zwei.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-de_DE-glados-medium with the Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  val config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-de_DE-glados-medium/de_DE-glados-medium.onnx",
        tokens = "vits-piper-de_DE-glados-medium/tokens.txt",
        dataDir = "vits-piper-de_DE-glados-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Alles hat ein Ende, nur die Wurst hat zwei.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with vits-piper-de_DE-glados-medium with the Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-de_DE-glados-medium/de_DE-glados-medium.onnx");
    vits.setTokens("vits-piper-de_DE-glados-medium/tokens.txt");
    vits.setDataDir("vits-piper-de_DE-glados-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Alles hat ein Ende, nur die Wurst hat zwei.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-de_DE-glados-medium with the Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-de_DE-glados-medium/de_DE-glados-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-de_DE-glados-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-de_DE-glados-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Alles hat ein Ende, nur die Wurst hat zwei.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-de_DE-glados-medium with the Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-de_DE-glados-medium/de_DE-glados-medium.onnx",
				Tokens:  "vits-piper-de_DE-glados-medium/tokens.txt",
				DataDir: "vits-piper-de_DE-glados-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Alles hat ein Ende, nur die Wurst hat zwei."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Alles hat ein Ende, nur die Wurst hat zwei.

audio samples for each speaker are listed below:

Speaker 0

vits-piper-de_DE-glados_turret-high

Info about this model

This model is converted from https://huggingface.co/systemofapwne/piper-de-glados

Number of speakers    Sample rate (Hz)
1                     22050

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-de_DE-glados_turret-high.tar.bz2
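
The archive can also be fetched and unpacked from a script. The following is a minimal sketch using only the Python standard library. The helper names (model_url, extract_archive, download_model) are hypothetical and not part of sherpa-onnx, and the sketch assumes the archive unpacks into a directory named after the model, as the paths in the examples below suggest.

```python
import tarfile
import urllib.request
from pathlib import Path

# Release URL prefix taken from the download address above.
BASE_URL = "https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models"


def model_url(name: str) -> str:
    """Return the archive URL for a model name (hypothetical helper)."""
    return f"{BASE_URL}/{name}.tar.bz2"


def extract_archive(archive: Path, dest: Path) -> None:
    """Unpack a .tar.bz2 archive into dest."""
    with tarfile.open(archive, "r:bz2") as tar:
        tar.extractall(dest)


def download_model(name: str, dest: Path = Path(".")) -> Path:
    """Download and unpack a model archive into dest.

    Returns the expected model directory, assuming the archive
    contains a top-level directory with the same name as the model.
    """
    archive = dest / f"{name}.tar.bz2"
    urllib.request.urlretrieve(model_url(name), archive)
    extract_archive(archive, dest)
    archive.unlink()  # remove the archive once it has been extracted
    return dest / name


# Usage (needs network access):
# download_model("vits-piper-de_DE-glados_turret-high")
```

Only unpack archives from sources you trust, since extraction writes whatever paths the archive contains.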

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI            URL        China mirror
arm64-v8a      Download   Download
armeabi-v7a    Download   Download
x86_64         Download   Download
x86            Download   Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-de_DE-glados_turret-high with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-de_DE-glados_turret-high/de_DE-glados_turret-high.onnx";
  config.model.vits.tokens = "vits-piper-de_DE-glados_turret-high/tokens.txt";
  config.model.vits.data_dir = "vits-piper-de_DE-glados_turret-high/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Alles hat ein Ende, nur die Wurst hat zwei.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
  -DSHERPA_ONNX_ENABLE_C_API=ON \
  -DCMAKE_BUILD_TYPE=Release \
  -DBUILD_SHARED_LIBS=ON \
  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
  ..

make
make install

You can find the required header files and library files in /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-piper \
  /tmp/test-piper.c \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.
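
If you launch the compiled example from a script rather than an interactive shell, the same loader path can be set for the child process only. The following is a minimal sketch. The helper names are hypothetical, and the paths are the ones assumed throughout this section.

```python
import os
import subprocess
import sys


def env_with_library_path(lib_dir: str, platform: str = sys.platform) -> dict:
    """Return a copy of the environment with lib_dir prepended to the
    dynamic loader search path (DYLD_LIBRARY_PATH on macOS,
    LD_LIBRARY_PATH otherwise)."""
    var = "DYLD_LIBRARY_PATH" if platform == "darwin" else "LD_LIBRARY_PATH"
    env = dict(os.environ)
    old = env.get(var, "")
    env[var] = f"{lib_dir}:{old}" if old else lib_dir
    return env


def run_example(binary: str, lib_dir: str, cwd: str) -> None:
    """Run a compiled example from cwd with lib_dir on the loader path."""
    subprocess.run([binary], cwd=cwd, env=env_with_library_path(lib_dir),
                   check=True)


# Usage (assumes the binary and the extracted model are in /tmp):
# run_example("/tmp/test-piper", "/tmp/sherpa-onnx/shared/lib", "/tmp")
```

Setting the variable per process avoids changing the environment of your whole shell session.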

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx and soundfile (used below to write the audio file) via

pip install sherpa-onnx soundfile

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-de_DE-glados_turret-high.tar.bz2

You can use the following code to play with vits-piper-de_DE-glados_turret-high

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-de_DE-glados_turret-high/de_DE-glados_turret-high.onnx",
            data_dir="vits-piper-de_DE-glados_turret-high/espeak-ng-data",
            tokens="vits-piper-de_DE-glados_turret-high/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Alles hat ein Ende, nur die Wurst hat zwei.",
                     sid=0,
                     speed=1.0)

sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
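
Several of the other bindings on this page pass a callback that returns 1 to continue and 0 to stop generation. The following is a minimal sketch of a stateful callback in Python that stops after a time budget. It assumes the Python generate method accepts a callback keyword argument that is called as callback(samples, progress) with the same return convention. The class name is hypothetical.

```python
class StopAfterSeconds:
    """Collect generated chunks and request a stop after a time budget.

    Assumption: the callback is invoked as callback(samples, progress),
    and returning 1 continues generation while returning 0 stops it.
    """

    def __init__(self, sample_rate: int, max_seconds: float):
        self.max_samples = int(sample_rate * max_seconds)
        self.chunks = []       # every chunk received so far
        self.num_samples = 0   # total number of samples received

    def __call__(self, samples, progress: float) -> int:
        self.chunks.append(samples)
        self.num_samples += len(samples)
        # Continue only while we are still under the sample budget.
        return 1 if self.num_samples < self.max_samples else 0


# Usage with the tts object created above (not executed here):
# cb = StopAfterSeconds(sample_rate=tts.sample_rate, max_seconds=5.0)
# audio = tts.generate(text="...", sid=0, speed=1.0, callback=cb)
```

A callable object is used instead of a plain function so that the collected chunks and the running sample count remain available after generation ends.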

C++ API

You can use the following code to play with vits-piper-de_DE-glados_turret-high with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-de_DE-glados_turret-high/de_DE-glados_turret-high.onnx";
  config.model.vits.tokens = "vits-piper-de_DE-glados_turret-high/tokens.txt";
  config.model.vits.data_dir = "vits-piper-de_DE-glados_turret-high/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Alles hat ein Ende, nur die Wurst hat zwei.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
  -DSHERPA_ONNX_ENABLE_C_API=ON \
  -DCMAKE_BUILD_TYPE=Release \
  -DBUILD_SHARED_LIBS=ON \
  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
  ..

make
make install

You can find the required header files and library files in /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++ \
  -std=c++17 \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-piper \
  /tmp/test-piper.cc \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-cxx-api \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with vits-piper-de_DE-glados_turret-high with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-de_DE-glados_turret-high/de_DE-glados_turret-high.onnx".into()),
                tokens: Some("vits-piper-de_DE-glados_turret-high/tokens.txt".into()),
                data_dir: Some("vits-piper-de_DE-glados_turret-high/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Alles hat ein Ende, nur die Wurst hat zwei.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-de_DE-glados_turret-high with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-de_DE-glados_turret-high/de_DE-glados_turret-high.onnx',
        tokens: 'vits-piper-de_DE-glados_turret-high/tokens.txt',
        dataDir: 'vits-piper-de_DE-glados_turret-high/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Alles hat ein Ende, nur die Wurst hat zwei.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to play with vits-piper-de_DE-glados_turret-high with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-de_DE-glados_turret-high/de_DE-glados_turret-high.onnx',
    tokens: 'vits-piper-de_DE-glados_turret-high/tokens.txt',
    dataDir: 'vits-piper-de_DE-glados_turret-high/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Alles hat ein Ende, nur die Wurst hat zwei.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-de_DE-glados_turret-high with Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-de_DE-glados_turret-high/de_DE-glados_turret-high.onnx",
    lexicon: "",
    tokens: "vits-piper-de_DE-glados_turret-high/tokens.txt",
    dataDir: "vits-piper-de_DE-glados_turret-high/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Alles hat ein Ende, nur die Wurst hat zwei."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-de_DE-glados_turret-high with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-de_DE-glados_turret-high/de_DE-glados_turret-high.onnx";
config.Model.Vits.Tokens = "vits-piper-de_DE-glados_turret-high/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-de_DE-glados_turret-high/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Alles hat ein Ende, nur die Wurst hat zwei.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-de_DE-glados_turret-high with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-de_DE-glados_turret-high/de_DE-glados_turret-high.onnx",
        tokens = "vits-piper-de_DE-glados_turret-high/tokens.txt",
        dataDir = "vits-piper-de_DE-glados_turret-high/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Alles hat ein Ende, nur die Wurst hat zwei.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with vits-piper-de_DE-glados_turret-high with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-de_DE-glados_turret-high/de_DE-glados_turret-high.onnx");
    vits.setTokens("vits-piper-de_DE-glados_turret-high/tokens.txt");
    vits.setDataDir("vits-piper-de_DE-glados_turret-high/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Alles hat ein Ende, nur die Wurst hat zwei.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-de_DE-glados_turret-high with Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-de_DE-glados_turret-high/de_DE-glados_turret-high.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-de_DE-glados_turret-high/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-de_DE-glados_turret-high/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Alles hat ein Ende, nur die Wurst hat zwei.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-de_DE-glados_turret-high with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-de_DE-glados_turret-high/de_DE-glados_turret-high.onnx",
				Tokens:  "vits-piper-de_DE-glados_turret-high/tokens.txt",
				DataDir: "vits-piper-de_DE-glados_turret-high/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Alles hat ein Ende, nur die Wurst hat zwei."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Alles hat ein Ende, nur die Wurst hat zwei.

audio samples for each speaker are listed below:

Speaker 0

vits-piper-de_DE-glados_turret-low

Info about this model

This model is converted from https://huggingface.co/systemofapwne/piper-de-glados

Number of speakers    Sample rate (Hz)
1                     16000
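
The sample rate listed above can be checked against a WAV file written by the examples in this section. The following is a minimal sketch using the Python standard library. The helper name wav_info is hypothetical, and the sketch assumes the file is uncompressed PCM, which is what the wave module can read.

```python
import wave


def wav_info(path: str) -> tuple:
    """Return (sample_rate, num_channels, duration_in_seconds) of a WAV file."""
    with wave.open(path, "rb") as f:
        rate = f.getframerate()
        return rate, f.getnchannels(), f.getnframes() / rate


# Usage after running one of the examples that writes test.wav:
# rate, channels, seconds = wav_info("test.wav")
# assert rate == 16000, f"unexpected sample rate: {rate}"
```

A mismatch usually means the file was produced by a different model, for example the 22050 Hz variant of the same voice.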

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-de_DE-glados_turret-low.tar.bz2

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI            URL        China mirror
arm64-v8a      Download   Download
armeabi-v7a    Download   Download
x86_64         Download   Download
x86            Download   Download

If you don’t know what ABI is, you probably need to select arm64-v8a.
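
If a device is connected over adb, you can query its ABI instead of guessing. The following is a minimal sketch. The function names are hypothetical, it assumes adb is on your PATH, and it reads the standard Android system property ro.product.cpu.abilist, which lists the supported ABIs in order of preference.

```python
import subprocess

# ABIs for which an APK is listed in the table above.
APK_ABIS = ["arm64-v8a", "armeabi-v7a", "x86_64", "x86"]


def choose_apk_abi(device_abilist: str) -> str:
    """Pick the first ABI in the device's comma-separated list
    (most preferred first) for which an APK is available."""
    for abi in device_abilist.strip().split(","):
        if abi in APK_ABIS:
            return abi
    raise ValueError(f"No APK available for ABIs: {device_abilist!r}")


def query_device_abilist() -> str:
    """Ask the connected device for its supported ABIs via adb."""
    result = subprocess.run(
        ["adb", "shell", "getprop", "ro.product.cpu.abilist"],
        capture_output=True, text=True, check=True)
    return result.stdout


# Usage (needs a connected device):
# print(choose_apk_abi(query_device_abilist()))
```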

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-de_DE-glados_turret-low with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-de_DE-glados_turret-low/de_DE-glados_turret-low.onnx";
  config.model.vits.tokens = "vits-piper-de_DE-glados_turret-low/tokens.txt";
  config.model.vits.data_dir = "vits-piper-de_DE-glados_turret-low/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Alles hat ein Ende, nur die Wurst hat zwei.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
  -DSHERPA_ONNX_ENABLE_C_API=ON \
  -DCMAKE_BUILD_TYPE=Release \
  -DBUILD_SHARED_LIBS=ON \
  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
  ..

make
make install

You can find the required header files and library files in /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-piper \
  /tmp/test-piper.c \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx and soundfile (used below to write the audio file) via

pip install sherpa-onnx soundfile

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-de_DE-glados_turret-low.tar.bz2

You can use the following code to play with vits-piper-de_DE-glados_turret-low

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-de_DE-glados_turret-low/de_DE-glados_turret-low.onnx",
            data_dir="vits-piper-de_DE-glados_turret-low/espeak-ng-data",
            tokens="vits-piper-de_DE-glados_turret-low/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Alles hat ein Ende, nur die Wurst hat zwei.",
                     sid=0,
                     speed=1.0)

sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)

C++ API

You can use the following code to play with vits-piper-de_DE-glados_turret-low with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-de_DE-glados_turret-low/de_DE-glados_turret-low.onnx";
  config.model.vits.tokens = "vits-piper-de_DE-glados_turret-low/tokens.txt";
  config.model.vits.data_dir = "vits-piper-de_DE-glados_turret-low/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Alles hat ein Ende, nur die Wurst hat zwei.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
  -DSHERPA_ONNX_ENABLE_C_API=ON \
  -DCMAKE_BUILD_TYPE=Release \
  -DBUILD_SHARED_LIBS=ON \
  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
  ..

make
make install

You can find the required header files and library files in /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++ \
  -std=c++17 \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-piper \
  /tmp/test-piper.cc \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-cxx-api \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with vits-piper-de_DE-glados_turret-low with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-de_DE-glados_turret-low/de_DE-glados_turret-low.onnx".into()),
                tokens: Some("vits-piper-de_DE-glados_turret-low/tokens.txt".into()),
                data_dir: Some("vits-piper-de_DE-glados_turret-low/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Alles hat ein Ende, nur die Wurst hat zwei.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-de_DE-glados_turret-low with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-de_DE-glados_turret-low/de_DE-glados_turret-low.onnx',
        tokens: 'vits-piper-de_DE-glados_turret-low/tokens.txt',
        dataDir: 'vits-piper-de_DE-glados_turret-low/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Alles hat ein Ende, nur die Wurst hat zwei.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to play with vits-piper-de_DE-glados_turret-low with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-de_DE-glados_turret-low/de_DE-glados_turret-low.onnx',
    tokens: 'vits-piper-de_DE-glados_turret-low/tokens.txt',
    dataDir: 'vits-piper-de_DE-glados_turret-low/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Alles hat ein Ende, nur die Wurst hat zwei.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to try vits-piper-de_DE-glados_turret-low with the Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-de_DE-glados_turret-low/de_DE-glados_turret-low.onnx",
    lexicon: "",
    tokens: "vits-piper-de_DE-glados_turret-low/tokens.txt",
    dataDir: "vits-piper-de_DE-glados_turret-low/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Alles hat ein Ende, nur die Wurst hat zwei."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to try vits-piper-de_DE-glados_turret-low with the C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-de_DE-glados_turret-low/de_DE-glados_turret-low.onnx";
config.Model.Vits.Tokens = "vits-piper-de_DE-glados_turret-low/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-de_DE-glados_turret-low/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Alles hat ein Ende, nur die Wurst hat zwei.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to try vits-piper-de_DE-glados_turret-low with the Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  val config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-de_DE-glados_turret-low/de_DE-glados_turret-low.onnx",
        tokens = "vits-piper-de_DE-glados_turret-low/tokens.txt",
        dataDir = "vits-piper-de_DE-glados_turret-low/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Alles hat ein Ende, nur die Wurst hat zwei.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to try vits-piper-de_DE-glados_turret-low with the Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-de_DE-glados_turret-low/de_DE-glados_turret-low.onnx");
    vits.setTokens("vits-piper-de_DE-glados_turret-low/tokens.txt");
    vits.setDataDir("vits-piper-de_DE-glados_turret-low/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Alles hat ein Ende, nur die Wurst hat zwei.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to try vits-piper-de_DE-glados_turret-low with the Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-de_DE-glados_turret-low/de_DE-glados_turret-low.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-de_DE-glados_turret-low/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-de_DE-glados_turret-low/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Alles hat ein Ende, nur die Wurst hat zwei.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to try vits-piper-de_DE-glados_turret-low with the Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-de_DE-glados_turret-low/de_DE-glados_turret-low.onnx",
				Tokens:  "vits-piper-de_DE-glados_turret-low/tokens.txt",
				DataDir: "vits-piper-de_DE-glados_turret-low/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Alles hat ein Ende, nur die Wurst hat zwei."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Alles hat ein Ende, nur die Wurst hat zwei.

sample audios for different speakers are listed below:

Speaker 0

vits-piper-de_DE-glados_turret-medium

Info about this model

This model is converted from https://huggingface.co/systemofapwne/piper-de-glados

Number of speakers    Sample rate
1                     22050

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-de_DE-glados_turret-medium.tar.bz2
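
If you prefer to script the download instead of fetching the archive by hand, the following is a minimal sketch using only the Python standard library. The helper names (`model_url`, `extract_archive`, `download_model`) are made up for this illustration and are not part of sherpa-onnx. The sketch also assumes that the archive unpacks into a directory named after the model, as the paths in the examples below suggest.

```python
import tarfile
import urllib.request
from pathlib import Path

# Base URL taken from the download address above.
BASE_URL = "https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models"


def model_url(model_name: str) -> str:
    """Return the release URL of the .tar.bz2 archive for a model."""
    return f"{BASE_URL}/{model_name}.tar.bz2"


def extract_archive(archive: Path, dest: Path) -> None:
    """Extract a .tar.bz2 archive into the directory dest."""
    with tarfile.open(archive, "r:bz2") as tar:
        tar.extractall(dest)


def download_model(model_name: str, dest: Path = Path(".")) -> Path:
    """Download and extract a model, then return the extracted directory.

    Assumption: the archive contains a top-level directory that has the
    same name as the model.
    """
    archive = dest / f"{model_name}.tar.bz2"
    urllib.request.urlretrieve(model_url(model_name), archive)
    extract_archive(archive, dest)
    archive.unlink()  # remove the archive once it has been extracted
    return dest / model_name
```

Calling `download_model("vits-piper-de_DE-glados_turret-medium")` should leave a `vits-piper-de_DE-glados_turret-medium` directory in the current working directory, which is where the examples below look for the model files.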

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI            URL         China mirror
arm64-v8a      Download    Download
armeabi-v7a    Download    Download
x86_64         Download    Download
x86            Download    Download

If you are not sure which ABI your device uses, arm64-v8a is most likely the one you need.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to try vits-piper-de_DE-glados_turret-medium with the C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-de_DE-glados_turret-medium/de_DE-glados_turret-medium.onnx";
  config.model.vits.tokens = "vits-piper-de_DE-glados_turret-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-de_DE-glados_turret-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

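  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
  if (!tts) {
    fprintf(stderr, "Failed to create the TTS engine. Please check your config.\n");
    return -1;
  }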

  const char *text = "Alles hat ein Ende, nur die Wurst hat zwei.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
  -DSHERPA_ONNX_ENABLE_C_API=ON \
  -DCMAKE_BUILD_TYPE=Release \
  -DBUILD_SHARED_LIBS=ON \
  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
  ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-piper \
  /tmp/test-piper.c \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-de_DE-glados_turret-medium.tar.bz2

You can use the following code to try vits-piper-de_DE-glados_turret-medium with the Python API:

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-de_DE-glados_turret-medium/de_DE-glados_turret-medium.onnx",
            data_dir="vits-piper-de_DE-glados_turret-medium/espeak-ng-data",
            tokens="vits-piper-de_DE-glados_turret-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Alles hat ein Ende, nur die Wurst hat zwei.",
                     sid=0,
                     speed=1.0)

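# Writing MP3 requires soundfile to be backed by libsndfile 1.1.0 or later.
# Use "test.wav" instead if MP3 is not supported on your system.
sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)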
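
The Node.js example in this document reports a real-time factor (RTF), that is, the elapsed processing time divided by the duration of the generated audio. The sketch below shows how the same number could be computed in Python. The helpers `real_time_factor` and `timed` are hypothetical names introduced here, and the commented usage assumes the `tts` and `audio` objects from the example above.

```python
import time


def real_time_factor(elapsed_seconds: float, num_samples: int,
                     sample_rate: int) -> float:
    """Processing time divided by the duration of the generated audio."""
    duration = num_samples / sample_rate
    return elapsed_seconds / duration


def timed(fn, *args, **kwargs):
    """Call fn and return a tuple (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


# Usage with the objects from the example above (not executed here):
#   audio, elapsed = timed(tts.generate, text="...", sid=0, speed=1.0)
#   rtf = real_time_factor(elapsed, len(audio.samples), audio.sample_rate)
```

An RTF below 1 means that the audio is generated faster than it takes to play it back.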

C++ API

You can use the following code to try vits-piper-de_DE-glados_turret-medium with the C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-de_DE-glados_turret-medium/de_DE-glados_turret-medium.onnx";
  config.model.vits.tokens = "vits-piper-de_DE-glados_turret-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-de_DE-glados_turret-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Alles hat ein Ende, nur die Wurst hat zwei.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
  -DSHERPA_ONNX_ENABLE_C_API=ON \
  -DCMAKE_BUILD_TYPE=Release \
  -DBUILD_SHARED_LIBS=ON \
  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
  ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++ \
  -std=c++17 \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-piper \
  /tmp/test-piper.cc \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-cxx-api \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to try vits-piper-de_DE-glados_turret-medium with the Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-de_DE-glados_turret-medium/de_DE-glados_turret-medium.onnx".into()),
                tokens: Some("vits-piper-de_DE-glados_turret-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-de_DE-glados_turret-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Alles hat ein Ende, nur die Wurst hat zwei.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-de_DE-glados_turret-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-de_DE-glados_turret-medium/de_DE-glados_turret-medium.onnx',
        tokens: 'vits-piper-de_DE-glados_turret-medium/tokens.txt',
        dataDir: 'vits-piper-de_DE-glados_turret-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Alles hat ein Ende, nur die Wurst hat zwei.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to try vits-piper-de_DE-glados_turret-medium with the Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-de_DE-glados_turret-medium/de_DE-glados_turret-medium.onnx',
    tokens: 'vits-piper-de_DE-glados_turret-medium/tokens.txt',
    dataDir: 'vits-piper-de_DE-glados_turret-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Alles hat ein Ende, nur die Wurst hat zwei.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to try vits-piper-de_DE-glados_turret-medium with the Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-de_DE-glados_turret-medium/de_DE-glados_turret-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-de_DE-glados_turret-medium/tokens.txt",
    dataDir: "vits-piper-de_DE-glados_turret-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Alles hat ein Ende, nur die Wurst hat zwei."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to try vits-piper-de_DE-glados_turret-medium with the C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-de_DE-glados_turret-medium/de_DE-glados_turret-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-de_DE-glados_turret-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-de_DE-glados_turret-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Alles hat ein Ende, nur die Wurst hat zwei.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to try vits-piper-de_DE-glados_turret-medium with the Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  val config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-de_DE-glados_turret-medium/de_DE-glados_turret-medium.onnx",
        tokens = "vits-piper-de_DE-glados_turret-medium/tokens.txt",
        dataDir = "vits-piper-de_DE-glados_turret-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Alles hat ein Ende, nur die Wurst hat zwei.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to try vits-piper-de_DE-glados_turret-medium with the Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-de_DE-glados_turret-medium/de_DE-glados_turret-medium.onnx");
    vits.setTokens("vits-piper-de_DE-glados_turret-medium/tokens.txt");
    vits.setDataDir("vits-piper-de_DE-glados_turret-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Alles hat ein Ende, nur die Wurst hat zwei.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to try vits-piper-de_DE-glados_turret-medium with the Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-de_DE-glados_turret-medium/de_DE-glados_turret-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-de_DE-glados_turret-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-de_DE-glados_turret-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Alles hat ein Ende, nur die Wurst hat zwei.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to try vits-piper-de_DE-glados_turret-medium with the Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-de_DE-glados_turret-medium/de_DE-glados_turret-medium.onnx",
				Tokens:  "vits-piper-de_DE-glados_turret-medium/tokens.txt",
				DataDir: "vits-piper-de_DE-glados_turret-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Alles hat ein Ende, nur die Wurst hat zwei."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Alles hat ein Ende, nur die Wurst hat zwei.

sample audios for different speakers are listed below:

Speaker 0

vits-piper-de_DE-karlsson-low

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/de/de_DE/karlsson/low

Number of speakers    Sample rate
1                     16000

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-de_DE-karlsson-low.tar.bz2
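
After extracting the archive, the examples in this section expect three entries inside the model directory: the `.onnx` model, `tokens.txt`, and `espeak-ng-data`. The following sketch checks for them before any TTS object is created. The function `missing_model_files` is a hypothetical helper, not part of sherpa-onnx, and it assumes the directory layout implied by the paths used in the examples below.

```python
from pathlib import Path


def missing_model_files(model_dir: str) -> list[str]:
    """Return the expected entries that are absent from model_dir.

    Assumed layout, taken from the paths in the examples:
      vits-piper-<voice>/<voice>.onnx
      vits-piper-<voice>/tokens.txt
      vits-piper-<voice>/espeak-ng-data
    """
    root = Path(model_dir)
    voice = root.name.removeprefix("vits-piper-")  # e.g. de_DE-karlsson-low
    expected = [f"{voice}.onnx", "tokens.txt", "espeak-ng-data"]
    return [name for name in expected if not (root / name).exists()]
```

An empty list from `missing_model_files("vits-piper-de_DE-karlsson-low")` means that all three entries were found. Otherwise the returned names show what is missing, which is usually easier to act on than a generic configuration error.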

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI            URL         China mirror
arm64-v8a      Download    Download
armeabi-v7a    Download    Download
x86_64         Download    Download
x86            Download    Download

If you are not sure which ABI your device uses, arm64-v8a is most likely the one you need.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to try vits-piper-de_DE-karlsson-low with the C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-de_DE-karlsson-low/de_DE-karlsson-low.onnx";
  config.model.vits.tokens = "vits-piper-de_DE-karlsson-low/tokens.txt";
  config.model.vits.data_dir = "vits-piper-de_DE-karlsson-low/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

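  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
  if (!tts) {
    fprintf(stderr, "Failed to create the TTS engine. Please check your config.\n");
    return -1;
  }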

  const char *text = "Alles hat ein Ende, nur die Wurst hat zwei.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
  -DSHERPA_ONNX_ENABLE_C_API=ON \
  -DCMAKE_BUILD_TYPE=Release \
  -DBUILD_SHARED_LIBS=ON \
  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
  ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-piper \
  /tmp/test-piper.c \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-de_DE-karlsson-low.tar.bz2

You can use the following code to try vits-piper-de_DE-karlsson-low with the Python API:

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-de_DE-karlsson-low/de_DE-karlsson-low.onnx",
            data_dir="vits-piper-de_DE-karlsson-low/espeak-ng-data",
            tokens="vits-piper-de_DE-karlsson-low/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Alles hat ein Ende, nur die Wurst hat zwei.",
                     sid=0,
                     speed=1.0)

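# Writing MP3 requires soundfile to be backed by libsndfile 1.1.0 or later.
# Use "test.wav" instead if MP3 is not supported on your system.
sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)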
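
If you would rather not depend on soundfile, the generated samples can also be written with Python's standard `wave` module. The sketch below is one possible way to do it and not the method used by sherpa-onnx itself. The function name `write_wave_int16` is made up for this illustration, and the code assumes that `audio.samples` holds mono floating-point samples in the range [-1, 1].

```python
import array
import sys
import wave


def write_wave_int16(filename: str, samples, sample_rate: int) -> None:
    """Write mono float samples in [-1, 1] as a 16-bit PCM WAV file."""
    pcm = array.array("h")
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))  # clip to avoid integer overflow
        pcm.append(int(s * 32767))
    if sys.byteorder == "big":
        pcm.byteswap()  # WAV data is stored as little-endian
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)  # mono
        f.setsampwidth(2)  # 2 bytes per sample, i.e. 16 bit
        f.setframerate(sample_rate)
        f.writeframes(pcm.tobytes())


# Usage with the objects from the example above (not executed here):
#   write_wave_int16("test.wav", audio.samples, audio.sample_rate)
```

The conversion loop is written for clarity rather than speed. For long utterances a vectorized conversion, for example with NumPy, would be faster.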

C++ API

You can use the following code to try vits-piper-de_DE-karlsson-low with the C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-de_DE-karlsson-low/de_DE-karlsson-low.onnx";
  config.model.vits.tokens = "vits-piper-de_DE-karlsson-low/tokens.txt";
  config.model.vits.data_dir = "vits-piper-de_DE-karlsson-low/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Alles hat ein Ende, nur die Wurst hat zwei.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html
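The progress callback in the C++ example follows a simple contract: it receives a chunk of samples plus a progress value, and returns 1 to continue or 0 to stop. The sketch below illustrates that contract in plain Python, under the assumption that a caller wants to stop once a sample budget is reached. `BudgetedCallback` is a hypothetical name and does not belong to any sherpa-onnx binding.

```python
class BudgetedCallback:
    """Collect audio chunks and request a stop once a budget is reached.

    Hypothetical helper mirroring the contract of the C++ callback:
    return 1 to continue generating, 0 to stop generating.
    """

    def __init__(self, max_samples):
        self.max_samples = max_samples
        self.chunks = []
        self.num_samples = 0

    def __call__(self, samples, progress):
        self.chunks.append(list(samples))  # keep a copy of this chunk
        self.num_samples += len(samples)
        print(f"Progress: {progress * 100:.3f}%")
        return 1 if self.num_samples < self.max_samples else 0
```

Keeping the state in an object rather than in globals plays the same role as the `void *arg` parameter of the C++ callback.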

Rust API

You can use the following code to play with vits-piper-de_DE-karlsson-low with the Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-de_DE-karlsson-low/de_DE-karlsson-low.onnx".into()),
                tokens: Some("vits-piper-de_DE-karlsson-low/tokens.txt".into()),
                data_dir: Some("vits-piper-de_DE-karlsson-low/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Alles hat ein Ende, nur die Wurst hat zwei.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-de_DE-karlsson-low with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-de_DE-karlsson-low/de_DE-karlsson-low.onnx',
        tokens: 'vits-piper-de_DE-karlsson-low/tokens.txt',
        dataDir: 'vits-piper-de_DE-karlsson-low/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Alles hat ein Ende, nur die Wurst hat zwei.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.
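The Node.js example above reports a real-time factor (RTF): the time spent generating divided by the duration of the generated audio, so values below 1 mean synthesis runs faster than playback. The same arithmetic can be sketched in Python. The helper names `real_time_factor` and `timed` are hypothetical and only illustrate the calculation.

```python
import time


def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """Return elapsed time divided by the duration of the generated audio."""
    duration = num_samples / sample_rate  # seconds of audio
    if duration <= 0:
        raise ValueError("no audio was generated")
    return elapsed_seconds / duration


def timed(fn, *args, **kwargs):
    """Run fn and return (result, elapsed_seconds) using a monotonic clock."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start
```

For example, 0.5 seconds spent generating 32000 samples at 16000 Hz gives `real_time_factor(0.5, 32000, 16000) == 0.25`.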

Dart API

You can use the following code to play with vits-piper-de_DE-karlsson-low with the Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-de_DE-karlsson-low/de_DE-karlsson-low.onnx',
    tokens: 'vits-piper-de_DE-karlsson-low/tokens.txt',
    dataDir: 'vits-piper-de_DE-karlsson-low/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Alles hat ein Ende, nur die Wurst hat zwei.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-de_DE-karlsson-low with the Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-de_DE-karlsson-low/de_DE-karlsson-low.onnx",
    lexicon: "",
    tokens: "vits-piper-de_DE-karlsson-low/tokens.txt",
    dataDir: "vits-piper-de_DE-karlsson-low/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Alles hat ein Ende, nur die Wurst hat zwei."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-de_DE-karlsson-low with the C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-de_DE-karlsson-low/de_DE-karlsson-low.onnx";
config.Model.Vits.Tokens = "vits-piper-de_DE-karlsson-low/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-de_DE-karlsson-low/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Alles hat ein Ende, nur die Wurst hat zwei.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-de_DE-karlsson-low with the Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-de_DE-karlsson-low/de_DE-karlsson-low.onnx",
        tokens = "vits-piper-de_DE-karlsson-low/tokens.txt",
        dataDir = "vits-piper-de_DE-karlsson-low/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Alles hat ein Ende, nur die Wurst hat zwei.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with vits-piper-de_DE-karlsson-low with the Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-de_DE-karlsson-low/de_DE-karlsson-low.onnx");
    vits.setTokens("vits-piper-de_DE-karlsson-low/tokens.txt");
    vits.setDataDir("vits-piper-de_DE-karlsson-low/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Alles hat ein Ende, nur die Wurst hat zwei.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-de_DE-karlsson-low with the Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-de_DE-karlsson-low/de_DE-karlsson-low.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-de_DE-karlsson-low/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-de_DE-karlsson-low/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Alles hat ein Ende, nur die Wurst hat zwei.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-de_DE-karlsson-low with the Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-de_DE-karlsson-low/de_DE-karlsson-low.onnx",
				Tokens:  "vits-piper-de_DE-karlsson-low/tokens.txt",
				DataDir: "vits-piper-de_DE-karlsson-low/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Alles hat ein Ende, nur die Wurst hat zwei."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Alles hat ein Ende, nur die Wurst hat zwei.

sample audio for each speaker is listed below:

Speaker 0

vits-piper-de_DE-kerstin-low

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/de/de_DE/kerstin/low

Number of speakers | Sample rate
1 | 16000

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-de_DE-kerstin-low.tar.bz2
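The archive at this address can also be fetched and unpacked from Python. The following is a minimal sketch using only the standard library. The function names are hypothetical, and the URL pattern is simply the one shown above with the model name substituted.

```python
import tarfile
import urllib.request

BASE_URL = "https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models"


def model_url(name):
    """Build the download URL for a model archive, following the pattern above."""
    return f"{BASE_URL}/{name}.tar.bz2"


def extract_archive(archive, dest="."):
    """Unpack a .tar.bz2 archive into dest (only use with trusted archives)."""
    with tarfile.open(archive, "r:bz2") as tar:
        tar.extractall(dest)


def download_and_extract(name, dest="."):
    """Download <name>.tar.bz2 into the current directory and unpack it."""
    archive = f"{name}.tar.bz2"
    urllib.request.urlretrieve(model_url(name), archive)
    extract_archive(archive, dest)
```

Calling `download_and_extract("vits-piper-de_DE-kerstin-low")` would then leave a `vits-piper-de_DE-kerstin-low/` directory next to the script, which is the layout the examples in this section assume.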

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI | URL | China mirror
arm64-v8a | Download | Download
armeabi-v7a | Download | Download
x86_64 | Download | Download
x86 | Download | Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-de_DE-kerstin-low with the C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-de_DE-kerstin-low/de_DE-kerstin-low.onnx";
  config.model.vits.tokens = "vits-piper-de_DE-kerstin-low/tokens.txt";
  config.model.vits.data_dir = "vits-piper-de_DE-kerstin-low/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Alles hat ein Ende, nur die Wurst hat zwei.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-de_DE-kerstin-low.tar.bz2

You can use the following code to play with vits-piper-de_DE-kerstin-low

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-de_DE-kerstin-low/de_DE-kerstin-low.onnx",
            data_dir="vits-piper-de_DE-kerstin-low/espeak-ng-data",
            tokens="vits-piper-de_DE-kerstin-low/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Alles hat ein Ende, nur die Wurst hat zwei.",
                     sid=0,
                     speed=1.0)

sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
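The paths in the config above are relative, so they only resolve when the script runs from the directory that contains the extracted model. A pre-flight check can make a missing file easier to spot. The sketch below is hypothetical helper code, not part of sherpa-onnx. It assumes the ONNX file is named after the model directory with the `vits-piper-` prefix removed, which holds for the models shown in this section.

```python
from pathlib import Path

PREFIX = "vits-piper-"


def piper_model_paths(model_dir):
    """Return the file layout used by the examples in this section.

    Assumption: <dir>/<dir name without "vits-piper-">.onnx, plus
    tokens.txt and espeak-ng-data inside the same directory.
    """
    root = Path(model_dir)
    name = root.name
    stem = name[len(PREFIX):] if name.startswith(PREFIX) else name
    return {
        "model": root / f"{stem}.onnx",
        "tokens": root / "tokens.txt",
        "data_dir": root / "espeak-ng-data",
    }


def missing_paths(model_dir):
    """List the expected paths that do not exist on disk."""
    return [str(p) for p in piper_model_paths(model_dir).values() if not p.exists()]
```

An empty list from `missing_paths("vits-piper-de_DE-kerstin-low")` indicates that all three paths used in the config exist.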

C++ API

You can use the following code to play with vits-piper-de_DE-kerstin-low with the C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-de_DE-kerstin-low/de_DE-kerstin-low.onnx";
  config.model.vits.tokens = "vits-piper-de_DE-kerstin-low/tokens.txt";
  config.model.vits.data_dir = "vits-piper-de_DE-kerstin-low/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Alles hat ein Ende, nur die Wurst hat zwei.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.
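When the binary is launched from a script rather than an interactive shell, the same library search path can be set programmatically. The following is a minimal sketch with a hypothetical helper, `library_path_update`. It picks the variable name by platform as described above and prepends the library directory.

```python
import os
import platform


def library_path_update(lib_dir, system=None, environ=None):
    """Return (variable_name, new_value) with lib_dir prepended.

    Linux uses LD_LIBRARY_PATH and macOS (reported as "Darwin")
    uses DYLD_LIBRARY_PATH, matching the export commands above.
    """
    system = system or platform.system()
    environ = os.environ if environ is None else environ
    names = {"Linux": "LD_LIBRARY_PATH", "Darwin": "DYLD_LIBRARY_PATH"}
    if system not in names:
        raise ValueError(f"unsupported system: {system}")
    name = names[system]
    current = environ.get(name, "")
    # Avoid a trailing ":" when the variable was previously unset.
    value = f"{lib_dir}:{current}" if current else lib_dir
    return name, value
```

The returned pair can be merged into a copy of `os.environ` and passed as the `env` argument of `subprocess.run(["/tmp/test-piper"], ...)`.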

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with vits-piper-de_DE-kerstin-low with the Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-de_DE-kerstin-low/de_DE-kerstin-low.onnx".into()),
                tokens: Some("vits-piper-de_DE-kerstin-low/tokens.txt".into()),
                data_dir: Some("vits-piper-de_DE-kerstin-low/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Alles hat ein Ende, nur die Wurst hat zwei.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-de_DE-kerstin-low with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-de_DE-kerstin-low/de_DE-kerstin-low.onnx',
        tokens: 'vits-piper-de_DE-kerstin-low/tokens.txt',
        dataDir: 'vits-piper-de_DE-kerstin-low/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Alles hat ein Ende, nur die Wurst hat zwei.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to play with vits-piper-de_DE-kerstin-low with the Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-de_DE-kerstin-low/de_DE-kerstin-low.onnx',
    tokens: 'vits-piper-de_DE-kerstin-low/tokens.txt',
    dataDir: 'vits-piper-de_DE-kerstin-low/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Alles hat ein Ende, nur die Wurst hat zwei.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-de_DE-kerstin-low with the Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-de_DE-kerstin-low/de_DE-kerstin-low.onnx",
    lexicon: "",
    tokens: "vits-piper-de_DE-kerstin-low/tokens.txt",
    dataDir: "vits-piper-de_DE-kerstin-low/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Alles hat ein Ende, nur die Wurst hat zwei."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-de_DE-kerstin-low with the C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-de_DE-kerstin-low/de_DE-kerstin-low.onnx";
config.Model.Vits.Tokens = "vits-piper-de_DE-kerstin-low/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-de_DE-kerstin-low/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Alles hat ein Ende, nur die Wurst hat zwei.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-de_DE-kerstin-low with the Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-de_DE-kerstin-low/de_DE-kerstin-low.onnx",
        tokens = "vits-piper-de_DE-kerstin-low/tokens.txt",
        dataDir = "vits-piper-de_DE-kerstin-low/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Alles hat ein Ende, nur die Wurst hat zwei.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with vits-piper-de_DE-kerstin-low with the Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-de_DE-kerstin-low/de_DE-kerstin-low.onnx");
    vits.setTokens("vits-piper-de_DE-kerstin-low/tokens.txt");
    vits.setDataDir("vits-piper-de_DE-kerstin-low/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Alles hat ein Ende, nur die Wurst hat zwei.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-de_DE-kerstin-low with the Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-de_DE-kerstin-low/de_DE-kerstin-low.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-de_DE-kerstin-low/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-de_DE-kerstin-low/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Alles hat ein Ende, nur die Wurst hat zwei.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-de_DE-kerstin-low with the Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-de_DE-kerstin-low/de_DE-kerstin-low.onnx",
				Tokens:  "vits-piper-de_DE-kerstin-low/tokens.txt",
				DataDir: "vits-piper-de_DE-kerstin-low/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Alles hat ein Ende, nur die Wurst hat zwei."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Alles hat ein Ende, nur die Wurst hat zwei.

sample audio for each speaker is listed below:

Speaker 0

vits-piper-de_DE-miro-high

Info about this model

This model is converted from https://huggingface.co/OpenVoiceOS/pipertts_de-DE_miro

Number of speakers    Sample rate (Hz)
1                     22050

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-de_DE-miro-high.tar.bz2
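If you prefer to fetch the archive from a script, the following is a minimal sketch using only the Python standard library. The helper names (model_archive_url, extract_archive, download_and_extract) are hypothetical and not part of sherpa-onnx. The sketch assumes the archive unpacks into a directory named after the model, which is what the paths in the examples below expect.

```python
import tarfile
import urllib.request
from pathlib import Path

# Release URL prefix taken from the download address above.
BASE_URL = "https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models"


def model_archive_url(model_name: str) -> str:
    """Return the release URL of the .tar.bz2 archive for a model."""
    return f"{BASE_URL}/{model_name}.tar.bz2"


def extract_archive(archive: Path, dest: Path) -> None:
    """Unpack a bzip2-compressed tar archive into dest."""
    with tarfile.open(archive, "r:bz2") as tar:
        tar.extractall(dest)


def download_and_extract(model_name: str, dest: Path = Path(".")) -> Path:
    """Download the archive, unpack it, delete the archive, return the model dir."""
    archive = dest / f"{model_name}.tar.bz2"
    urllib.request.urlretrieve(model_archive_url(model_name), archive)
    extract_archive(archive, dest)
    archive.unlink()
    # Assumption: the archive contains a top-level directory named model_name.
    return dest / model_name
```

Calling download_and_extract("vits-piper-de_DE-miro-high") from the directory where you run the examples should leave a vits-piper-de_DE-miro-high folder next to your script.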

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI            URL         China mirror
arm64-v8a      Download    Download
armeabi-v7a    Download    Download
x86_64         Download    Download
x86            Download    Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to try vits-piper-de_DE-miro-high with the C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-de_DE-miro-high/de_DE-miro-high.onnx";
  config.model.vits.tokens = "vits-piper-de_DE-miro-high/tokens.txt";
  config.model.vits.data_dir = "vits-piper-de_DE-miro-high/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Alles hat ein Ende, nur die Wurst hat zwei.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc   -I /tmp/sherpa-onnx/shared/include   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-c-api   -lonnxruntime   -o /tmp/test-piper   /tmp/test-piper.c

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-de_DE-miro-high.tar.bz2

You can use the following code to play with vits-piper-de_DE-miro-high

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-de_DE-miro-high/de_DE-miro-high.onnx",
            data_dir="vits-piper-de_DE-miro-high/espeak-ng-data",
            tokens="vits-piper-de_DE-miro-high/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Alles hat ein Ende, nur die Wurst hat zwei.",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
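The Node.js example in this section reports a real-time factor (RTF), defined there as the elapsed time divided by the duration of the generated audio. The following sketch shows the same calculation in Python. The helpers real_time_factor and timed are hypothetical names introduced for illustration; they are not part of sherpa-onnx.

```python
import time


def real_time_factor(elapsed_seconds: float, num_samples: int, sample_rate: int) -> float:
    """Return elapsed time divided by the duration of the generated audio."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # audio length in seconds
    return elapsed_seconds / duration


def timed(fn, *args, **kwargs):
    """Call fn and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start
```

With the tts object from the example above, you could wrap the call as timed(tts.generate, text=..., sid=0, speed=1.0) and then pass the elapsed time, len(audio.samples), and audio.sample_rate to real_time_factor. A value below 1 means the audio was generated faster than it takes to play.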

C++ API

You can use the following code to try vits-piper-de_DE-miro-high with the C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-de_DE-miro-high/de_DE-miro-high.onnx";
  config.model.vits.tokens = "vits-piper-de_DE-miro-high/tokens.txt";
  config.model.vits.data_dir = "vits-piper-de_DE-miro-high/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Alles hat ein Ende, nur die Wurst hat zwei.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++   -std=c++17   -I /tmp/sherpa-onnx/shared/include   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-cxx-api   -lsherpa-onnx-c-api   -lonnxruntime   -o /tmp/test-piper   /tmp/test-piper.cc

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to try vits-piper-de_DE-miro-high with the Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-de_DE-miro-high/de_DE-miro-high.onnx".into()),
                tokens: Some("vits-piper-de_DE-miro-high/tokens.txt".into()),
                data_dir: Some("vits-piper-de_DE-miro-high/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Alles hat ein Ende, nur die Wurst hat zwei.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-de_DE-miro-high with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-de_DE-miro-high/de_DE-miro-high.onnx',
        tokens: 'vits-piper-de_DE-miro-high/tokens.txt',
        dataDir: 'vits-piper-de_DE-miro-high/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Alles hat ein Ende, nur die Wurst hat zwei.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to try vits-piper-de_DE-miro-high with the Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-de_DE-miro-high/de_DE-miro-high.onnx',
    tokens: 'vits-piper-de_DE-miro-high/tokens.txt',
    dataDir: 'vits-piper-de_DE-miro-high/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Alles hat ein Ende, nur die Wurst hat zwei.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to try vits-piper-de_DE-miro-high with the Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-de_DE-miro-high/de_DE-miro-high.onnx",
    lexicon: "",
    tokens: "vits-piper-de_DE-miro-high/tokens.txt",
    dataDir: "vits-piper-de_DE-miro-high/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Alles hat ein Ende, nur die Wurst hat zwei."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to try vits-piper-de_DE-miro-high with the C# API.

using System;
using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-de_DE-miro-high/de_DE-miro-high.onnx";
config.Model.Vits.Tokens = "vits-piper-de_DE-miro-high/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-de_DE-miro-high/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Alles hat ein Ende, nur die Wurst hat zwei.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to try vits-piper-de_DE-miro-high with the Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-de_DE-miro-high/de_DE-miro-high.onnx",
        tokens = "vits-piper-de_DE-miro-high/tokens.txt",
        dataDir = "vits-piper-de_DE-miro-high/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Alles hat ein Ende, nur die Wurst hat zwei.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to try vits-piper-de_DE-miro-high with the Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-de_DE-miro-high/de_DE-miro-high.onnx");
    vits.setTokens("vits-piper-de_DE-miro-high/tokens.txt");
    vits.setDataDir("vits-piper-de_DE-miro-high/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Alles hat ein Ende, nur die Wurst hat zwei.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to try vits-piper-de_DE-miro-high with the Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-de_DE-miro-high/de_DE-miro-high.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-de_DE-miro-high/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-de_DE-miro-high/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Alles hat ein Ende, nur die Wurst hat zwei.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to try vits-piper-de_DE-miro-high with the Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-de_DE-miro-high/de_DE-miro-high.onnx",
				Tokens:  "vits-piper-de_DE-miro-high/tokens.txt",
				DataDir: "vits-piper-de_DE-miro-high/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Alles hat ein Ende, nur die Wurst hat zwei."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Alles hat ein Ende, nur die Wurst hat zwei.

sample audio for the different speakers is listed below:

Speaker 0

vits-piper-de_DE-pavoque-low

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/de/de_DE/pavoque/low

Number of speakers    Sample rate (Hz)
1                     16000

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-de_DE-pavoque-low.tar.bz2
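Before running any of the examples below, it can help to confirm that the extracted directory contains the three entries they reference: the .onnx model, tokens.txt, and the espeak-ng-data directory. The following Python sketch does that check. The function names are hypothetical, and the rule for deriving the .onnx file name (dropping the vits-piper- prefix) is an assumption inferred from the paths used in this section.

```python
from pathlib import Path


def expected_model_files(model_name: str) -> list[str]:
    """Relative paths that the examples in this section pass to the VITS config."""
    # Assumption: "vits-piper-de_DE-pavoque-low" -> "de_DE-pavoque-low.onnx"
    stem = model_name.removeprefix("vits-piper-")
    return [f"{stem}.onnx", "tokens.txt", "espeak-ng-data"]


def missing_model_files(model_dir: Path) -> list[str]:
    """Return the expected entries that are absent from model_dir."""
    return [
        rel
        for rel in expected_model_files(model_dir.name)
        if not (model_dir / rel).exists()
    ]
```

An empty list from missing_model_files(Path("vits-piper-de_DE-pavoque-low")) suggests the model was extracted into the current directory as the examples assume.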

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI            URL         China mirror
arm64-v8a      Download    Download
armeabi-v7a    Download    Download
x86_64         Download    Download
x86            Download    Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to try vits-piper-de_DE-pavoque-low with the C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-de_DE-pavoque-low/de_DE-pavoque-low.onnx";
  config.model.vits.tokens = "vits-piper-de_DE-pavoque-low/tokens.txt";
  config.model.vits.data_dir = "vits-piper-de_DE-pavoque-low/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Alles hat ein Ende, nur die Wurst hat zwei.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc   -I /tmp/sherpa-onnx/shared/include   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-c-api   -lonnxruntime   -o /tmp/test-piper   /tmp/test-piper.c

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-de_DE-pavoque-low.tar.bz2

You can use the following code to play with vits-piper-de_DE-pavoque-low

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-de_DE-pavoque-low/de_DE-pavoque-low.onnx",
            data_dir="vits-piper-de_DE-pavoque-low/espeak-ng-data",
            tokens="vits-piper-de_DE-pavoque-low/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Alles hat ein Ende, nur die Wurst hat zwei.",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
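If you would rather not depend on soundfile, the samples can also be written with Python's standard wave module. The sketch below is one possible way to do it, not the method used by sherpa-onnx itself. The function name write_wav_pcm16 is hypothetical, and the code assumes the generated samples are mono floats in the range [-1.0, 1.0].

```python
import struct
import wave


def write_wav_pcm16(path: str, samples, sample_rate: int) -> None:
    """Write mono float samples as a 16-bit PCM WAV file."""
    pcm = bytearray()
    for s in samples:
        # Assumption: samples lie in [-1.0, 1.0]; clip anything outside.
        clipped = max(-1.0, min(1.0, float(s)))
        pcm += struct.pack("<h", int(round(clipped * 32767)))
    with wave.open(path, "wb") as f:
        f.setnchannels(1)  # single channel
        f.setsampwidth(2)  # 2 bytes per sample = 16 bit
        f.setframerate(sample_rate)
        f.writeframes(bytes(pcm))
```

With the audio object from the example above, the call would be write_wav_pcm16("test.wav", audio.samples, audio.sample_rate).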

C++ API

You can use the following code to try vits-piper-de_DE-pavoque-low with the C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-de_DE-pavoque-low/de_DE-pavoque-low.onnx";
  config.model.vits.tokens = "vits-piper-de_DE-pavoque-low/tokens.txt";
  config.model.vits.data_dir = "vits-piper-de_DE-pavoque-low/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Alles hat ein Ende, nur die Wurst hat zwei.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++   -std=c++17   -I /tmp/sherpa-onnx/shared/include   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-cxx-api   -lsherpa-onnx-c-api   -lonnxruntime   -o /tmp/test-piper   /tmp/test-piper.cc

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to try vits-piper-de_DE-pavoque-low with the Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-de_DE-pavoque-low/de_DE-pavoque-low.onnx".into()),
                tokens: Some("vits-piper-de_DE-pavoque-low/tokens.txt".into()),
                data_dir: Some("vits-piper-de_DE-pavoque-low/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Alles hat ein Ende, nur die Wurst hat zwei.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-de_DE-pavoque-low with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-de_DE-pavoque-low/de_DE-pavoque-low.onnx',
        tokens: 'vits-piper-de_DE-pavoque-low/tokens.txt',
        dataDir: 'vits-piper-de_DE-pavoque-low/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Alles hat ein Ende, nur die Wurst hat zwei.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to try vits-piper-de_DE-pavoque-low with the Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-de_DE-pavoque-low/de_DE-pavoque-low.onnx',
    tokens: 'vits-piper-de_DE-pavoque-low/tokens.txt',
    dataDir: 'vits-piper-de_DE-pavoque-low/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Alles hat ein Ende, nur die Wurst hat zwei.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to try vits-piper-de_DE-pavoque-low with the Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-de_DE-pavoque-low/de_DE-pavoque-low.onnx",
    lexicon: "",
    tokens: "vits-piper-de_DE-pavoque-low/tokens.txt",
    dataDir: "vits-piper-de_DE-pavoque-low/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Alles hat ein Ende, nur die Wurst hat zwei."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to try vits-piper-de_DE-pavoque-low with the C# API.

using System;
using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-de_DE-pavoque-low/de_DE-pavoque-low.onnx";
config.Model.Vits.Tokens = "vits-piper-de_DE-pavoque-low/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-de_DE-pavoque-low/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Alles hat ein Ende, nur die Wurst hat zwei.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-de_DE-pavoque-low with the Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-de_DE-pavoque-low/de_DE-pavoque-low.onnx",
        tokens = "vits-piper-de_DE-pavoque-low/tokens.txt",
        dataDir = "vits-piper-de_DE-pavoque-low/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Alles hat ein Ende, nur die Wurst hat zwei.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with vits-piper-de_DE-pavoque-low with the Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-de_DE-pavoque-low/de_DE-pavoque-low.onnx");
    vits.setTokens("vits-piper-de_DE-pavoque-low/tokens.txt");
    vits.setDataDir("vits-piper-de_DE-pavoque-low/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Alles hat ein Ende, nur die Wurst hat zwei.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-de_DE-pavoque-low with the Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-de_DE-pavoque-low/de_DE-pavoque-low.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-de_DE-pavoque-low/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-de_DE-pavoque-low/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Alles hat ein Ende, nur die Wurst hat zwei.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-de_DE-pavoque-low with the Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-de_DE-pavoque-low/de_DE-pavoque-low.onnx",
				Tokens:  "vits-piper-de_DE-pavoque-low/tokens.txt",
				DataDir: "vits-piper-de_DE-pavoque-low/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Alles hat ein Ende, nur die Wurst hat zwei."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Alles hat ein Ende, nur die Wurst hat zwei.

sample audios for different speakers are listed below:

Speaker 0

vits-piper-de_DE-ramona-low

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/de/de_DE/ramona/low

Number of speakers    Sample rate
1                     16000

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-de_DE-ramona-low.tar.bz2
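
The archive can be fetched with wget and unpacked with tar, as shown for other models in this document. As a rough alternative, the sketch below does the same with the Python standard library only. The helper names `extract_model` and `download_and_extract` are hypothetical (they are not part of sherpa-onnx), and the download call is left commented out so that nothing touches the network unless you opt in.

```python
import tarfile
import urllib.request
from pathlib import Path

# URL copied from the download address above.
MODEL_URL = (
    "https://github.com/k2-fsa/sherpa-onnx/releases/download/"
    "tts-models/vits-piper-de_DE-ramona-low.tar.bz2"
)


def extract_model(archive_path, dest_dir="."):
    """Unpack a .tar.bz2 archive into dest_dir and return the sorted member names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:bz2") as tar:
        names = sorted(tar.getnames())
        # Use the safer "data" extraction filter on Python versions that have it.
        kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
        tar.extractall(dest, **kwargs)
    return names


def download_and_extract(url=MODEL_URL, dest_dir="."):
    """Download the archive into dest_dir, then unpack it (needs network access)."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    archive = dest / url.rsplit("/", 1)[-1]
    urllib.request.urlretrieve(url, str(archive))
    return extract_model(archive, dest)


# Uncomment to actually download and unpack the model:
# print(download_and_extract())
```

After extraction, the directory vits-piper-de_DE-ramona-low/ should contain the files referenced by the examples below (the .onnx model, tokens.txt, and espeak-ng-data).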

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI            URL         China mirror
arm64-v8a      Download    Download
armeabi-v7a    Download    Download
x86_64         Download    Download
x86            Download    Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-de_DE-ramona-low with the C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-de_DE-ramona-low/de_DE-ramona-low.onnx";
  config.model.vits.tokens = "vits-piper-de_DE-ramona-low/tokens.txt";
  config.model.vits.data_dir = "vits-piper-de_DE-ramona-low/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Alles hat ein Ende, nur die Wurst hat zwei.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.
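
If you launch the binary from a script rather than from a shell, the same library path can be set per process. The following is a minimal sketch, not part of sherpa-onnx: `env_with_lib_dir` is a hypothetical helper that prepends the library directory to LD_LIBRARY_PATH (or DYLD_LIBRARY_PATH on macOS) without leaving a trailing separator when the variable was previously unset.

```python
import os
import platform


def env_with_lib_dir(lib_dir, base_env=None, system=None):
    """Return a copy of the environment with lib_dir prepended to the loader path."""
    env = dict(os.environ if base_env is None else base_env)
    system = platform.system() if system is None else system
    # macOS uses DYLD_LIBRARY_PATH; Linux uses LD_LIBRARY_PATH.
    var = "DYLD_LIBRARY_PATH" if system == "Darwin" else "LD_LIBRARY_PATH"
    current = env.get(var, "")
    env[var] = lib_dir + (os.pathsep + current if current else "")
    return env


# Example usage (assumes /tmp/test-piper was built as described above):
# import subprocess
# subprocess.run(
#     ["/tmp/test-piper"],
#     cwd="/tmp",
#     env=env_with_lib_dir("/tmp/sherpa-onnx/shared/lib"),
#     check=True,
# )
```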

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx and soundfile (the example below uses soundfile to write the audio) via

pip install sherpa-onnx soundfile

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-de_DE-ramona-low.tar.bz2

You can use the following code to play with vits-piper-de_DE-ramona-low

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-de_DE-ramona-low/de_DE-ramona-low.onnx",
            data_dir="vits-piper-de_DE-ramona-low/espeak-ng-data",
            tokens="vits-piper-de_DE-ramona-low/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Alles hat ein Ende, nur die Wurst hat zwei.",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
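
To sanity-check a generated WAV file (for example the test.wav written by the examples in this section), you can read back its header and confirm that the sample rate matches the 16000 Hz listed for this model. The sketch below uses only the standard library and assumes the file is plain PCM WAV. `wav_info` and `write_test_tone` are hypothetical helpers; the latter only writes a synthetic tone so that the snippet can be tried without a model.

```python
import math
import struct
import wave


def wav_info(path):
    """Return (sample_rate, num_channels, duration_in_seconds) of a PCM WAV file."""
    with wave.open(path, "rb") as f:
        rate = f.getframerate()
        return rate, f.getnchannels(), f.getnframes() / rate


def write_test_tone(path, sample_rate=16000, seconds=0.5, freq=440.0):
    """Write a mono 16-bit sine tone; it stands in for real TTS output here."""
    n = int(sample_rate * seconds)
    frames = b"".join(
        struct.pack("<h", int(32767 * 0.3 * math.sin(2 * math.pi * freq * i / sample_rate)))
        for i in range(n)
    )
    with wave.open(path, "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        f.setframerate(sample_rate)
        f.writeframes(frames)


# Example usage on a real output file:
# rate, channels, duration = wav_info("test.wav")
# print(rate, channels, round(duration, 3))
```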

C++ API

You can use the following code to play with vits-piper-de_DE-ramona-low with the C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-de_DE-ramona-low/de_DE-ramona-low.onnx";
  config.model.vits.tokens = "vits-piper-de_DE-ramona-low/tokens.txt";
  config.model.vits.data_dir = "vits-piper-de_DE-ramona-low/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Alles hat ein Ende, nur die Wurst hat zwei.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with vits-piper-de_DE-ramona-low with the Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-de_DE-ramona-low/de_DE-ramona-low.onnx".into()),
                tokens: Some("vits-piper-de_DE-ramona-low/tokens.txt".into()),
                data_dir: Some("vits-piper-de_DE-ramona-low/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Alles hat ein Ende, nur die Wurst hat zwei.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-de_DE-ramona-low with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-de_DE-ramona-low/de_DE-ramona-low.onnx',
        tokens: 'vits-piper-de_DE-ramona-low/tokens.txt',
        dataDir: 'vits-piper-de_DE-ramona-low/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Alles hat ein Ende, nur die Wurst hat zwei.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to play with vits-piper-de_DE-ramona-low with the Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-de_DE-ramona-low/de_DE-ramona-low.onnx',
    tokens: 'vits-piper-de_DE-ramona-low/tokens.txt',
    dataDir: 'vits-piper-de_DE-ramona-low/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Alles hat ein Ende, nur die Wurst hat zwei.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-de_DE-ramona-low with the Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-de_DE-ramona-low/de_DE-ramona-low.onnx",
    lexicon: "",
    tokens: "vits-piper-de_DE-ramona-low/tokens.txt",
    dataDir: "vits-piper-de_DE-ramona-low/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Alles hat ein Ende, nur die Wurst hat zwei."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-de_DE-ramona-low with the C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-de_DE-ramona-low/de_DE-ramona-low.onnx";
config.Model.Vits.Tokens = "vits-piper-de_DE-ramona-low/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-de_DE-ramona-low/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Alles hat ein Ende, nur die Wurst hat zwei.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-de_DE-ramona-low with the Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-de_DE-ramona-low/de_DE-ramona-low.onnx",
        tokens = "vits-piper-de_DE-ramona-low/tokens.txt",
        dataDir = "vits-piper-de_DE-ramona-low/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Alles hat ein Ende, nur die Wurst hat zwei.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with vits-piper-de_DE-ramona-low with the Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-de_DE-ramona-low/de_DE-ramona-low.onnx");
    vits.setTokens("vits-piper-de_DE-ramona-low/tokens.txt");
    vits.setDataDir("vits-piper-de_DE-ramona-low/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Alles hat ein Ende, nur die Wurst hat zwei.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-de_DE-ramona-low with the Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-de_DE-ramona-low/de_DE-ramona-low.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-de_DE-ramona-low/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-de_DE-ramona-low/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Alles hat ein Ende, nur die Wurst hat zwei.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-de_DE-ramona-low with the Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-de_DE-ramona-low/de_DE-ramona-low.onnx",
				Tokens:  "vits-piper-de_DE-ramona-low/tokens.txt",
				DataDir: "vits-piper-de_DE-ramona-low/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Alles hat ein Ende, nur die Wurst hat zwei."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Alles hat ein Ende, nur die Wurst hat zwei.

sample audios for different speakers are listed below:

Speaker 0

vits-piper-de_DE-thorsten-high

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/de/de_DE/thorsten/high

Number of speakers    Sample rate
1                     22050

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-de_DE-thorsten-high.tar.bz2

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI            URL         China mirror
arm64-v8a      Download    Download
armeabi-v7a    Download    Download
x86_64         Download    Download
x86            Download    Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-de_DE-thorsten-high with the C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-de_DE-thorsten-high/de_DE-thorsten-high.onnx";
  config.model.vits.tokens = "vits-piper-de_DE-thorsten-high/tokens.txt";
  config.model.vits.data_dir = "vits-piper-de_DE-thorsten-high/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Alles hat ein Ende, nur die Wurst hat zwei.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx and soundfile (the example below uses soundfile to write the audio) via

pip install sherpa-onnx soundfile

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-de_DE-thorsten-high.tar.bz2

You can use the following code to play with vits-piper-de_DE-thorsten-high

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-de_DE-thorsten-high/de_DE-thorsten-high.onnx",
            data_dir="vits-piper-de_DE-thorsten-high/espeak-ng-data",
            tokens="vits-piper-de_DE-thorsten-high/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Alles hat ein Ende, nur die Wurst hat zwei.",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
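
The Node.js example in this document reports a real-time factor (RTF), computed as elapsed time divided by the duration of the generated audio. A minimal Python sketch of the same arithmetic follows; `real_time_factor` and `timed` are hypothetical helpers, and the numbers in the demonstration are made up rather than measured.

```python
import time


def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """RTF = processing time / audio duration; below 1.0 is faster than real time."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate
    return elapsed_seconds / duration


def timed(fn, *args, **kwargs):
    """Call fn and return (result, elapsed_seconds) using a monotonic clock."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


# Made-up numbers: 0.5 s of compute for 44100 samples at 22050 Hz (2 s of audio).
print(real_time_factor(0.5, 44100, 22050))  # prints 0.25
```

With the Python example above, you could wrap the call as `audio, elapsed = timed(tts.generate, text=..., sid=0, speed=1.0)` and then pass `len(audio.samples)` and `audio.sample_rate` to `real_time_factor`.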

C++ API

You can use the following code to play with vits-piper-de_DE-thorsten-high with the C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-de_DE-thorsten-high/de_DE-thorsten-high.onnx";
  config.model.vits.tokens = "vits-piper-de_DE-thorsten-high/tokens.txt";
  config.model.vits.data_dir = "vits-piper-de_DE-thorsten-high/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Alles hat ein Ende, nur die Wurst hat zwei.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with vits-piper-de_DE-thorsten-high with the Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-de_DE-thorsten-high/de_DE-thorsten-high.onnx".into()),
                tokens: Some("vits-piper-de_DE-thorsten-high/tokens.txt".into()),
                data_dir: Some("vits-piper-de_DE-thorsten-high/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Alles hat ein Ende, nur die Wurst hat zwei.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-de_DE-thorsten-high with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-de_DE-thorsten-high/de_DE-thorsten-high.onnx',
        tokens: 'vits-piper-de_DE-thorsten-high/tokens.txt',
        dataDir: 'vits-piper-de_DE-thorsten-high/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Alles hat ein Ende, nur die Wurst hat zwei.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.
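
The Node.js example above reports a real-time factor (RTF): the time spent on synthesis divided by the duration of the generated audio, so values below 1 mean the audio is generated faster than it plays. The same arithmetic is sketched below in Python. The helper name real_time_factor and the numbers are hypothetical and are not part of sherpa-onnx.

```python
def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """Return synthesis time divided by the duration of the generated audio."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # audio length in seconds
    return elapsed_seconds / duration


# Hypothetical numbers: 48000 samples at 16000 Hz (3 seconds of audio)
# generated in 0.6 seconds of wall-clock time.
rtf = real_time_factor(elapsed_seconds=0.6, num_samples=48000, sample_rate=16000)
print(f"RTF = {rtf:.3f}")
```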

Dart API

You can use the following code to play with vits-piper-de_DE-thorsten-high with the Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-de_DE-thorsten-high/de_DE-thorsten-high.onnx',
    tokens: 'vits-piper-de_DE-thorsten-high/tokens.txt',
    dataDir: 'vits-piper-de_DE-thorsten-high/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Alles hat ein Ende, nur die Wurst hat zwei.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-de_DE-thorsten-high with the Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-de_DE-thorsten-high/de_DE-thorsten-high.onnx",
    lexicon: "",
    tokens: "vits-piper-de_DE-thorsten-high/tokens.txt",
    dataDir: "vits-piper-de_DE-thorsten-high/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Alles hat ein Ende, nur die Wurst hat zwei."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-de_DE-thorsten-high with the C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-de_DE-thorsten-high/de_DE-thorsten-high.onnx";
config.Model.Vits.Tokens = "vits-piper-de_DE-thorsten-high/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-de_DE-thorsten-high/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Alles hat ein Ende, nur die Wurst hat zwei.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-de_DE-thorsten-high with the Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-de_DE-thorsten-high/de_DE-thorsten-high.onnx",
        tokens = "vits-piper-de_DE-thorsten-high/tokens.txt",
        dataDir = "vits-piper-de_DE-thorsten-high/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Alles hat ein Ende, nur die Wurst hat zwei.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.
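
The callback in the Kotlin example returns 1 to continue and 0 to stop. The control flow behind that convention can be illustrated with a small stand-in written in Python. This is not sherpa-onnx code: generate_in_chunks and stop_after_half are hypothetical, the progress argument mirrors the C++ callback rather than the Kotlin one, and whether a real engine keeps the chunk that triggered the stop is an assumption here.

```python
from typing import Callable, List


def generate_in_chunks(chunks: List[List[float]],
                       callback: Callable[[List[float], float], int]) -> List[float]:
    """Stand-in for a TTS engine: deliver chunks until the callback returns 0."""
    out: List[float] = []
    for i, chunk in enumerate(chunks, start=1):
        out.extend(chunk)
        progress = i / len(chunks)  # fraction of the work done so far
        if callback(chunk, progress) == 0:
            break  # the caller asked to stop early
    return out


def stop_after_half(samples: List[float], progress: float) -> int:
    """Return 1 to continue, 0 to stop once half of the work is done."""
    return 1 if progress < 0.5 else 0


audio = generate_in_chunks([[0.1], [0.2], [0.3], [0.4]], stop_after_half)
print(len(audio), "samples kept")
```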

Java API

You can use the following code to play with vits-piper-de_DE-thorsten-high with the Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-de_DE-thorsten-high/de_DE-thorsten-high.onnx");
    vits.setTokens("vits-piper-de_DE-thorsten-high/tokens.txt");
    vits.setDataDir("vits-piper-de_DE-thorsten-high/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Alles hat ein Ende, nur die Wurst hat zwei.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-de_DE-thorsten-high with the Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-de_DE-thorsten-high/de_DE-thorsten-high.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-de_DE-thorsten-high/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-de_DE-thorsten-high/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Alles hat ein Ende, nur die Wurst hat zwei.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-de_DE-thorsten-high with the Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-de_DE-thorsten-high/de_DE-thorsten-high.onnx",
				Tokens:  "vits-piper-de_DE-thorsten-high/tokens.txt",
				DataDir: "vits-piper-de_DE-thorsten-high/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Alles hat ein Ende, nur die Wurst hat zwei."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Alles hat ein Ende, nur die Wurst hat zwei.

sample audio for each speaker is listed below:

Speaker 0

vits-piper-de_DE-thorsten-low

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/de/de_DE/thorsten/low

Number of speakers | Sample rate
1 | 16000

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-de_DE-thorsten-low.tar.bz2
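
If you prefer to script the download, the following is a minimal sketch using only the Python standard library. The URL is the one given above; the helper names download and extract are hypothetical and are not part of sherpa-onnx.

```python
import tarfile
import urllib.request
from pathlib import Path

MODEL_URL = (
    "https://github.com/k2-fsa/sherpa-onnx/releases/download/"
    "tts-models/vits-piper-de_DE-thorsten-low.tar.bz2"
)


def download(url, archive):
    """Download url to archive unless the file already exists."""
    archive = Path(archive)
    if not archive.exists():
        urllib.request.urlretrieve(url, str(archive))
    return archive


def extract(archive, dest):
    """Extract a .tar.bz2 archive into dest and return the member names."""
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive, "r:bz2") as tar:
        names = tar.getnames()
        if hasattr(tarfile, "data_filter"):
            # Newer Python versions can refuse unsafe archive members.
            tar.extractall(dest, filter="data")
        else:
            tar.extractall(dest)
    return names
```

Calling extract(download(MODEL_URL, "vits-piper-de_DE-thorsten-low.tar.bz2"), ".") should leave the model directory in the current directory, matching the relative paths used in the examples below.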

Android APK

The following table lists the Android TTS Engine APKs with this model for sherpa-onnx v1.13.2.

ABI | URL | China mirror
arm64-v8a | Download | Download
armeabi-v7a | Download | Download
x86_64 | Download | Download
x86 | Download | Download

If you don’t know what ABI is, you probably need to select arm64-v8a.
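
One way to check the ABI of a connected device is adb shell getprop ro.product.cpu.abi, which prints the ABI name directly. If you only know the machine name reported by uname -m, the sketch below maps it to the ABI names in the table. The mapping keys are an assumption and may not cover every device; pick_abi is a hypothetical helper.

```python
# Maps `uname -m` style machine names to the Android ABI names in the table.
# The keys are assumptions based on common values, not an exhaustive list.
MACHINE_TO_ABI = {
    "aarch64": "arm64-v8a",
    "arm64": "arm64-v8a",
    "armv7l": "armeabi-v7a",
    "x86_64": "x86_64",
    "i686": "x86",
    "i386": "x86",
}


def pick_abi(machine, default="arm64-v8a"):
    """Return the ABI for a machine name, falling back to arm64-v8a."""
    return MACHINE_TO_ABI.get(machine.strip().lower(), default)


print(pick_abi("aarch64"))
```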

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-de_DE-thorsten-low with the C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-de_DE-thorsten-low/de_DE-thorsten-low.onnx";
  config.model.vits.tokens = "vits-piper-de_DE-thorsten-low/tokens.txt";
  config.model.vits.data_dir = "vits-piper-de_DE-thorsten-low/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Alles hat ein Ende, nur die Wurst hat zwei.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.c -lsherpa-onnx-c-api -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-de_DE-thorsten-low.tar.bz2

You can use the following code to play with vits-piper-de_DE-thorsten-low with the Python API.

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-de_DE-thorsten-low/de_DE-thorsten-low.onnx",
            data_dir="vits-piper-de_DE-thorsten-low/espeak-ng-data",
            tokens="vits-piper-de_DE-thorsten-low/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Alles hat ein Ende, nur die Wurst hat zwei.",
                     sid=0,
                     speed=1.0)

sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
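
The example above relies on the third-party soundfile package. If only the Python standard library is available, something like the following sketch could write the samples as a 16-bit mono WAV file instead. It assumes the samples are floats in the range [-1, 1]; write_wave_int16 is a hypothetical helper, not part of sherpa-onnx.

```python
import struct
import wave


def write_wave_int16(filename, samples, sample_rate):
    """Write mono float samples as 16-bit PCM and return the frame count."""
    pcm = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))  # clip to the valid range
        pcm += struct.pack("<h", int(round(s * 32767)))
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)  # mono
        f.setsampwidth(2)  # 2 bytes per sample
        f.setframerate(sample_rate)
        f.writeframes(bytes(pcm))
    return len(pcm) // 2
```

With the objects from the example above, the call would look like write_wave_int16("test.wav", audio.samples, audio.sample_rate).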

C++ API

You can use the following code to play with vits-piper-de_DE-thorsten-low with the C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-de_DE-thorsten-low/de_DE-thorsten-low.onnx";
  config.model.vits.tokens = "vits-piper-de_DE-thorsten-low/tokens.txt";
  config.model.vits.data_dir = "vits-piper-de_DE-thorsten-low/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Alles hat ein Ende, nur die Wurst hat zwei.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.cc -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with vits-piper-de_DE-thorsten-low with the Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-de_DE-thorsten-low/de_DE-thorsten-low.onnx".into()),
                tokens: Some("vits-piper-de_DE-thorsten-low/tokens.txt".into()),
                data_dir: Some("vits-piper-de_DE-thorsten-low/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Alles hat ein Ende, nur die Wurst hat zwei.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-de_DE-thorsten-low with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-de_DE-thorsten-low/de_DE-thorsten-low.onnx',
        tokens: 'vits-piper-de_DE-thorsten-low/tokens.txt',
        dataDir: 'vits-piper-de_DE-thorsten-low/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Alles hat ein Ende, nur die Wurst hat zwei.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to play with vits-piper-de_DE-thorsten-low with the Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-de_DE-thorsten-low/de_DE-thorsten-low.onnx',
    tokens: 'vits-piper-de_DE-thorsten-low/tokens.txt',
    dataDir: 'vits-piper-de_DE-thorsten-low/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Alles hat ein Ende, nur die Wurst hat zwei.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-de_DE-thorsten-low with the Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-de_DE-thorsten-low/de_DE-thorsten-low.onnx",
    lexicon: "",
    tokens: "vits-piper-de_DE-thorsten-low/tokens.txt",
    dataDir: "vits-piper-de_DE-thorsten-low/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Alles hat ein Ende, nur die Wurst hat zwei."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-de_DE-thorsten-low with the C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-de_DE-thorsten-low/de_DE-thorsten-low.onnx";
config.Model.Vits.Tokens = "vits-piper-de_DE-thorsten-low/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-de_DE-thorsten-low/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Alles hat ein Ende, nur die Wurst hat zwei.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-de_DE-thorsten-low with the Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-de_DE-thorsten-low/de_DE-thorsten-low.onnx",
        tokens = "vits-piper-de_DE-thorsten-low/tokens.txt",
        dataDir = "vits-piper-de_DE-thorsten-low/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Alles hat ein Ende, nur die Wurst hat zwei.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with vits-piper-de_DE-thorsten-low with the Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-de_DE-thorsten-low/de_DE-thorsten-low.onnx");
    vits.setTokens("vits-piper-de_DE-thorsten-low/tokens.txt");
    vits.setDataDir("vits-piper-de_DE-thorsten-low/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Alles hat ein Ende, nur die Wurst hat zwei.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-de_DE-thorsten-low with the Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-de_DE-thorsten-low/de_DE-thorsten-low.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-de_DE-thorsten-low/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-de_DE-thorsten-low/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Alles hat ein Ende, nur die Wurst hat zwei.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-de_DE-thorsten-low with the Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-de_DE-thorsten-low/de_DE-thorsten-low.onnx",
				Tokens:  "vits-piper-de_DE-thorsten-low/tokens.txt",
				DataDir: "vits-piper-de_DE-thorsten-low/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Alles hat ein Ende, nur die Wurst hat zwei."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Alles hat ein Ende, nur die Wurst hat zwei.

sample audio for each speaker is listed below:

Speaker 0

vits-piper-de_DE-thorsten-medium

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/de/de_DE/thorsten/medium

Number of speakers | Sample rate
1 | 22050

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-de_DE-thorsten-medium.tar.bz2

Android APK

The following table lists the Android TTS Engine APKs with this model for sherpa-onnx v1.13.2.

ABI | URL | China mirror
arm64-v8a | Download | Download
armeabi-v7a | Download | Download
x86_64 | Download | Download
x86 | Download | Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-de_DE-thorsten-medium with the C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-de_DE-thorsten-medium/de_DE-thorsten-medium.onnx";
  config.model.vits.tokens = "vits-piper-de_DE-thorsten-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-de_DE-thorsten-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Alles hat ein Ende, nur die Wurst hat zwei.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc   -I /tmp/sherpa-onnx/shared/include   -o /tmp/test-piper   /tmp/test-piper.c   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.
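
If you launch the example from a script, the same setting can be applied only to the child process. The following is a minimal sketch in Python; the helper names `library_path_var` and `env_with_library_dir` are hypothetical, and the library directory is the install prefix used above.

```python
import os
import sys


def library_path_var(platform_name: str = sys.platform) -> str:
    """Name of the environment variable the dynamic loader consults."""
    return "DYLD_LIBRARY_PATH" if platform_name == "darwin" else "LD_LIBRARY_PATH"


def env_with_library_dir(lib_dir: str, env: dict,
                         platform_name: str = sys.platform) -> dict:
    """Return a copy of env with lib_dir prepended to the loader search path."""
    var = library_path_var(platform_name)
    new_env = dict(env)
    old = new_env.get(var, "")
    # Prepend so that the freshly built libraries win over older copies.
    new_env[var] = f"{lib_dir}{os.pathsep}{old}" if old else lib_dir
    return new_env
```

A possible use is `subprocess.run(["/tmp/test-piper"], cwd="/tmp", env=env_with_library_dir("/tmp/sherpa-onnx/shared/lib", os.environ))`, which leaves the environment of your shell untouched.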

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-de_DE-thorsten-medium.tar.bz2

You can use the following code to play with vits-piper-de_DE-thorsten-medium

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-de_DE-thorsten-medium/de_DE-thorsten-medium.onnx",
            data_dir="vits-piper-de_DE-thorsten-medium/espeak-ng-data",
            tokens="vits-piper-de_DE-thorsten-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Alles hat ein Ende, nur die Wurst hat zwei.",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
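
The Node.js example in this section reports a real-time factor (RTF): the time spent generating divided by the duration of the generated audio. The same figure can be computed in Python. The following is a minimal sketch; `real_time_factor` is a hypothetical helper, not a sherpa-onnx function.

```python
def real_time_factor(elapsed_seconds: float, num_samples: int,
                     sample_rate: int) -> float:
    """Return elapsed time divided by audio duration (lower means faster)."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # audio length in seconds
    return elapsed_seconds / duration


# Usage with the objects from the example above (assumes `import time`):
#   start = time.time()
#   audio = tts.generate(text="...", sid=0, speed=1.0)
#   elapsed = time.time() - start
#   print(real_time_factor(elapsed, len(audio.samples), audio.sample_rate))
```

An RTF below 1 means the audio is generated faster than it takes to play back.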

C++ API

You can use the following code to play with vits-piper-de_DE-thorsten-medium with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-de_DE-thorsten-medium/de_DE-thorsten-medium.onnx";
  config.model.vits.tokens = "vits-piper-de_DE-thorsten-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-de_DE-thorsten-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Alles hat ein Ende, nur die Wurst hat zwei.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++   -std=c++17   -I /tmp/sherpa-onnx/shared/include   -o /tmp/test-piper   /tmp/test-piper.cc   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-cxx-api   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.
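
The example uses model paths relative to the working directory, so it fails if the model was not extracted where it is run. The following is a minimal sketch of a pre-flight check in Python; `missing_model_files` is a hypothetical helper, and the list of expected entries is taken from the three paths the example sets.

```python
from pathlib import Path
from typing import List


def missing_model_files(model_dir: str, onnx_name: str) -> List[str]:
    """Return the expected model entries that do not exist under model_dir."""
    root = Path(model_dir)
    expected = [
        root / onnx_name,         # the acoustic model
        root / "tokens.txt",      # the token table
        root / "espeak-ng-data",  # the espeak-ng data directory
    ]
    return [str(p) for p in expected if not p.exists()]
```

For this model, `missing_model_files("/tmp/vits-piper-de_DE-thorsten-medium", "de_DE-thorsten-medium.onnx")` should return an empty list before you run the example.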

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with vits-piper-de_DE-thorsten-medium with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-de_DE-thorsten-medium/de_DE-thorsten-medium.onnx".into()),
                tokens: Some("vits-piper-de_DE-thorsten-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-de_DE-thorsten-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Alles hat ein Ende, nur die Wurst hat zwei.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-de_DE-thorsten-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-de_DE-thorsten-medium/de_DE-thorsten-medium.onnx',
        tokens: 'vits-piper-de_DE-thorsten-medium/tokens.txt',
        dataDir: 'vits-piper-de_DE-thorsten-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Alles hat ein Ende, nur die Wurst hat zwei.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to play with vits-piper-de_DE-thorsten-medium with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-de_DE-thorsten-medium/de_DE-thorsten-medium.onnx',
    tokens: 'vits-piper-de_DE-thorsten-medium/tokens.txt',
    dataDir: 'vits-piper-de_DE-thorsten-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Alles hat ein Ende, nur die Wurst hat zwei.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-de_DE-thorsten-medium with Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-de_DE-thorsten-medium/de_DE-thorsten-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-de_DE-thorsten-medium/tokens.txt",
    dataDir: "vits-piper-de_DE-thorsten-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Alles hat ein Ende, nur die Wurst hat zwei."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-de_DE-thorsten-medium with C# API.

using System;
using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-de_DE-thorsten-medium/de_DE-thorsten-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-de_DE-thorsten-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-de_DE-thorsten-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Alles hat ein Ende, nur die Wurst hat zwei.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-de_DE-thorsten-medium with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  val config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-de_DE-thorsten-medium/de_DE-thorsten-medium.onnx",
        tokens = "vits-piper-de_DE-thorsten-medium/tokens.txt",
        dataDir = "vits-piper-de_DE-thorsten-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Alles hat ein Ende, nur die Wurst hat zwei.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with vits-piper-de_DE-thorsten-medium with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-de_DE-thorsten-medium/de_DE-thorsten-medium.onnx");
    vits.setTokens("vits-piper-de_DE-thorsten-medium/tokens.txt");
    vits.setDataDir("vits-piper-de_DE-thorsten-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Alles hat ein Ende, nur die Wurst hat zwei.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-de_DE-thorsten-medium with Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-de_DE-thorsten-medium/de_DE-thorsten-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-de_DE-thorsten-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-de_DE-thorsten-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Alles hat ein Ende, nur die Wurst hat zwei.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-de_DE-thorsten-medium with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-de_DE-thorsten-medium/de_DE-thorsten-medium.onnx",
				Tokens:  "vits-piper-de_DE-thorsten-medium/tokens.txt",
				DataDir: "vits-piper-de_DE-thorsten-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Alles hat ein Ende, nur die Wurst hat zwei."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Alles hat ein Ende, nur die Wurst hat zwei.

audio samples for each speaker are listed below:

Speaker 0

vits-piper-de_DE-thorsten_emotional-medium

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/de/de_DE/thorsten_emotional/medium

Number of speakers   Sample rate
8                    22050

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-de_DE-thorsten_emotional-medium.tar.bz2
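
The download and extraction can also be scripted. The following is a minimal sketch using only the Python standard library; the helper names are hypothetical, and it assumes the archive unpacks into a directory named after the model, as the file paths in the examples for this model suggest.

```python
import tarfile
import urllib.request
from pathlib import Path

# Release location taken from the download address above.
BASE_URL = "https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models"


def model_url(name: str) -> str:
    """Build the archive URL for a model name."""
    return f"{BASE_URL}/{name}.tar.bz2"


def extract_archive(archive: Path, dest: Path) -> None:
    """Unpack a .tar.bz2 archive into dest."""
    with tarfile.open(archive, "r:bz2") as tar:
        tar.extractall(dest)


def download_model(name: str, dest: Path) -> Path:
    """Download and unpack a model; return the assumed model directory."""
    dest.mkdir(parents=True, exist_ok=True)
    archive = dest / f"{name}.tar.bz2"
    urllib.request.urlretrieve(model_url(name), archive)
    extract_archive(archive, dest)
    archive.unlink()  # the archive is no longer needed once unpacked
    return dest / name
```

For example, `download_model("vits-piper-de_DE-thorsten_emotional-medium", Path("/tmp"))` would place the model under /tmp, which is where the compiled examples expect it.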

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI            URL        China mirror
arm64-v8a      Download   Download
armeabi-v7a    Download   Download
x86_64         Download   Download
x86            Download   Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-de_DE-thorsten_emotional-medium with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-de_DE-thorsten_emotional-medium/de_DE-thorsten_emotional-medium.onnx";
  config.model.vits.tokens = "vits-piper-de_DE-thorsten_emotional-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-de_DE-thorsten_emotional-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Alles hat ein Ende, nur die Wurst hat zwei.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc   -I /tmp/sherpa-onnx/shared/include   -o /tmp/test-piper   /tmp/test-piper.c   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-de_DE-thorsten_emotional-medium.tar.bz2

You can use the following code to play with vits-piper-de_DE-thorsten_emotional-medium

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-de_DE-thorsten_emotional-medium/de_DE-thorsten_emotional-medium.onnx",
            data_dir="vits-piper-de_DE-thorsten_emotional-medium/espeak-ng-data",
            tokens="vits-piper-de_DE-thorsten_emotional-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Alles hat ein Ende, nur die Wurst hat zwei.",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
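
The model information above lists 8 speakers, while the example only uses sid=0. The following is a minimal sketch of generating one file per speaker; `synthesize_all_speakers` and its `write` parameter are hypothetical, and the only sherpa-onnx call it relies on is `tts.generate(text=..., sid=..., speed=...)` as used above.

```python
from typing import Callable, List


def synthesize_all_speakers(tts, text: str, num_speakers: int,
                            write: Callable[[str, object, int], None],
                            prefix: str = "test") -> List[str]:
    """Generate `text` once per speaker ID and return the file names written."""
    filenames = []
    for sid in range(num_speakers):
        audio = tts.generate(text=text, sid=sid, speed=1.0)
        filename = f"{prefix}-sid-{sid}.wav"
        # `write` receives (filename, samples, sample_rate).
        write(filename, audio.samples, audio.sample_rate)
        filenames.append(filename)
    return filenames


# Usage with the objects from the example above:
#   synthesize_all_speakers(
#       tts, "Alles hat ein Ende, nur die Wurst hat zwei.", 8,
#       lambda f, samples, sr: sf.write(f, samples, samplerate=sr))
```

Passing the writer as a parameter keeps the loop independent of the audio library, so it can be tried with a stand-in object before the model is downloaded.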

C++ API

You can use the following code to play with vits-piper-de_DE-thorsten_emotional-medium with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-de_DE-thorsten_emotional-medium/de_DE-thorsten_emotional-medium.onnx";
  config.model.vits.tokens = "vits-piper-de_DE-thorsten_emotional-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-de_DE-thorsten_emotional-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Alles hat ein Ende, nur die Wurst hat zwei.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++   -std=c++17   -I /tmp/sherpa-onnx/shared/include   -o /tmp/test-piper   /tmp/test-piper.cc   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-cxx-api   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with vits-piper-de_DE-thorsten_emotional-medium with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-de_DE-thorsten_emotional-medium/de_DE-thorsten_emotional-medium.onnx".into()),
                tokens: Some("vits-piper-de_DE-thorsten_emotional-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-de_DE-thorsten_emotional-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Alles hat ein Ende, nur die Wurst hat zwei.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-de_DE-thorsten_emotional-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-de_DE-thorsten_emotional-medium/de_DE-thorsten_emotional-medium.onnx',
        tokens: 'vits-piper-de_DE-thorsten_emotional-medium/tokens.txt',
        dataDir: 'vits-piper-de_DE-thorsten_emotional-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Alles hat ein Ende, nur die Wurst hat zwei.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to play with vits-piper-de_DE-thorsten_emotional-medium with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-de_DE-thorsten_emotional-medium/de_DE-thorsten_emotional-medium.onnx',
    tokens: 'vits-piper-de_DE-thorsten_emotional-medium/tokens.txt',
    dataDir: 'vits-piper-de_DE-thorsten_emotional-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Alles hat ein Ende, nur die Wurst hat zwei.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-de_DE-thorsten_emotional-medium with Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-de_DE-thorsten_emotional-medium/de_DE-thorsten_emotional-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-de_DE-thorsten_emotional-medium/tokens.txt",
    dataDir: "vits-piper-de_DE-thorsten_emotional-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Alles hat ein Ende, nur die Wurst hat zwei."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-de_DE-thorsten_emotional-medium with C# API.

using System;
using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-de_DE-thorsten_emotional-medium/de_DE-thorsten_emotional-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-de_DE-thorsten_emotional-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-de_DE-thorsten_emotional-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Alles hat ein Ende, nur die Wurst hat zwei.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-de_DE-thorsten_emotional-medium with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  val config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-de_DE-thorsten_emotional-medium/de_DE-thorsten_emotional-medium.onnx",
        tokens = "vits-piper-de_DE-thorsten_emotional-medium/tokens.txt",
        dataDir = "vits-piper-de_DE-thorsten_emotional-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Alles hat ein Ende, nur die Wurst hat zwei.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with vits-piper-de_DE-thorsten_emotional-medium with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-de_DE-thorsten_emotional-medium/de_DE-thorsten_emotional-medium.onnx");
    vits.setTokens("vits-piper-de_DE-thorsten_emotional-medium/tokens.txt");
    vits.setDataDir("vits-piper-de_DE-thorsten_emotional-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Alles hat ein Ende, nur die Wurst hat zwei.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-de_DE-thorsten_emotional-medium with Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-de_DE-thorsten_emotional-medium/de_DE-thorsten_emotional-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-de_DE-thorsten_emotional-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-de_DE-thorsten_emotional-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Alles hat ein Ende, nur die Wurst hat zwei.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-de_DE-thorsten_emotional-medium with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-de_DE-thorsten_emotional-medium/de_DE-thorsten_emotional-medium.onnx",
				Tokens:  "vits-piper-de_DE-thorsten_emotional-medium/tokens.txt",
				DataDir: "vits-piper-de_DE-thorsten_emotional-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Alles hat ein Ende, nur die Wurst hat zwei."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Alles hat ein Ende, nur die Wurst hat zwei.

sample audios for different speakers are listed below:

Speaker 0

Speaker 1

Speaker 2

Speaker 3

Speaker 4

Speaker 5

Speaker 6

Speaker 7

supertonic-3-de

Info about this model

This model is Supertonic 3 from https://huggingface.co/Supertone/supertonic-3

It supports 31 languages: en, ko, ja, ar, bg, cs, da, de, el, es, et, fi, fr, hi, hr, hu, id, it, lt, lv, nl, pl, pt, ro, ru, sk, sl, sv, tr, uk, vi.

This page shows samples for German (de).
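
The examples on this page select the language by passing `{"lang": "de"}` as the `extra` setting of the generation config. The following is a minimal sketch of how such a string could be built and checked against the language list above before it is handed to any of the APIs. `make_extra` and `SUPPORTED_LANGS` are hypothetical names introduced here for illustration; they are not part of sherpa-onnx.

```python
import json

# Language codes copied from the list above (31 entries).
SUPPORTED_LANGS = (
    "en ko ja ar bg cs da de el es et fi fr hi hr hu id it "
    "lt lv nl pl pt ro ru sk sl sv tr uk vi"
).split()


def make_extra(lang: str) -> str:
    """Return the JSON string used for the `extra` field.

    Hypothetical helper, not part of sherpa-onnx.
    """
    if lang not in SUPPORTED_LANGS:
        raise ValueError(f"unsupported language code: {lang!r}")
    return json.dumps({"lang": lang})


if __name__ == "__main__":
    print(len(SUPPORTED_LANGS))
    print(make_extra("de"))
```

Rejecting an unknown code early gives a clearer error than passing it through to the engine, whose behavior for unsupported codes is not described on this page.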

| Number of speakers | Sample rate |
|--------------------|-------------|
| 10                 | 24000       |

Speaker IDs

sid: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-supertonic-3-tts-int8-2026-05-11.tar.bz2
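
The code examples on this page reference seven files inside the extracted model directory. As a rough sketch, the following checks that they are all present before you create the TTS engine. The file names are taken from those examples; `supertonic_paths` and `missing_files` are hypothetical helpers, not sherpa-onnx functions.

```python
from pathlib import Path

# Config field name -> file name, as used in the examples on this page.
SUPERTONIC_FILES = {
    "duration_predictor": "duration_predictor.int8.onnx",
    "text_encoder": "text_encoder.int8.onnx",
    "vector_estimator": "vector_estimator.int8.onnx",
    "vocoder": "vocoder.int8.onnx",
    "tts_json": "tts.json",
    "unicode_indexer": "unicode_indexer.bin",
    "voice_style": "voice.bin",
}


def supertonic_paths(model_dir):
    """Map each config field name to its path inside model_dir."""
    root = Path(model_dir)
    return {field: str(root / name) for field, name in SUPERTONIC_FILES.items()}


def missing_files(model_dir):
    """Return the sorted field names whose file does not exist."""
    paths = supertonic_paths(model_dir)
    return sorted(f for f, p in paths.items() if not Path(p).is_file())


if __name__ == "__main__":
    model_dir = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11"
    print(missing_files(model_dir) or "all files present")
```

The dictionary returned by `supertonic_paths` uses the same field names as the Python API example, so its values can be copied into that config directly.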

Android APK

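The following table lists the Android TTS Engine APKs with this model for sherpa-onnx v1.13.2.

| ABI         | URL      | China mirror |
|-------------|----------|--------------|
| arm64-v8a   | Download | Download     |
| armeabi-v7a | Download | Download     |
| x86_64      | Download | Download     |
| x86         | Download | Download     |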

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx soundfile

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-supertonic-3-tts-int8-2026-05-11.tar.bz2

You can use the following code to play with supertonic-3 with the Python API.

import sherpa_onnx
import soundfile as sf

tts_config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        supertonic=sherpa_onnx.OfflineTtsSupertonicModelConfig(
            duration_predictor="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
            text_encoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
            vector_estimator="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
            vocoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
            tts_json="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
            unicode_indexer="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
            voice_style="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
        ),
        debug=False,
        num_threads=2,
        provider="cpu",
    ),
)

tts = sherpa_onnx.OfflineTts(tts_config)

gen_config = sherpa_onnx.GenerationConfig()
gen_config.sid = 0
gen_config.num_steps = 8
gen_config.speed = 1.0
gen_config.extra["lang"] = "de"

audio = tts.generate("Dies ist eine Text-to-Speech-Engine, die Kaldi der nächsten Generation verwendet", gen_config)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
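
To confirm that the generated file matches the sample rate listed above, you can read back its header. The following is a minimal sketch using only the Python standard library. It assumes that test.wav is an integer PCM WAV file, which should be the case with soundfile's default WAV subtype; `wav_info` is a hypothetical helper name.

```python
import wave


def wav_info(path):
    """Return (sample_rate, num_channels, duration_seconds) of a PCM WAV file."""
    with wave.open(path, "rb") as f:
        rate = f.getframerate()
        return rate, f.getnchannels(), f.getnframes() / rate


if __name__ == "__main__":
    # Run this after the example above has written test.wav.
    print(wav_info("test.wav"))
```

If the file was saved as floating-point WAV instead, the `wave` module may refuse to open it; in that case `soundfile.info` is an alternative.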

C API

You can use the following code to play with supertonic-3 with C API.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));
  config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
  config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
  config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
  config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
  config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
  config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
  config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";

  config.model.num_threads = 2;

  // If you don't want to see debug messages, please set it to 0
  config.model.debug = 1;

  const char *text = "Dies ist eine Text-to-Speech-Engine, die Kaldi der nächsten Generation verwendet";

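  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
  if (!tts) {
    // Creation fails if the config is invalid, e.g., a model file is missing
    fprintf(stderr, "Failed to create the TTS engine; please check the config\n");
    return -1;
  }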

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.num_steps = 8;
  gen_cfg.speed = 1.0;
  gen_cfg.extra = "{\"lang\": \"de\"}";

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg,
                                             ProgressCallback, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

To build and run the example, first build sherpa-onnx as a shared library with the C API enabled:

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
 -DSHERPA_ONNX_ENABLE_C_API=ON \
 -DCMAKE_BUILD_TYPE=Release \
 -DBUILD_SHARED_LIBS=ON \
 -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
 ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-supertonic.c. Then you can compile it with the following command:

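# Libraries are listed after the source file so that linkers which
# resolve symbols from left to right can find them.
gcc \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-supertonic \
  /tmp/test-supertonic.c \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-c-api \
  -lonnxruntime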

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-supertonic.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html
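
The progress callback in the C example returns 1 to continue generating and 0 to stop. The sketch below illustrates that contract in Python with a callback that stops once a given amount of audio has been received. It is a standalone illustration under stated assumptions: `StopAfterSeconds` is a hypothetical class, the `(samples, progress)` arguments mirror the C signature without `num_samples` and `arg`, and the code is not wired to sherpa-onnx.

```python
class StopAfterSeconds:
    """Callback sketch: ask the generator to stop after max_seconds of audio.

    Follows the contract from the C example: return 1 to continue,
    0 to stop. Hypothetical class, not part of sherpa-onnx.
    """

    def __init__(self, sample_rate, max_seconds):
        # Number of samples to accept before asking the generator to stop.
        self.budget = int(sample_rate * max_seconds)
        self.received = 0

    def __call__(self, samples, progress):
        self.received += len(samples)
        return 1 if self.received < self.budget else 0


if __name__ == "__main__":
    cb = StopAfterSeconds(sample_rate=24000, max_seconds=1.0)
    # Simulate three chunks of 0.5 s each arriving from the generator.
    for i in range(3):
        print(cb([0.0] * 12000, (i + 1) / 3))
```

Keeping the state in an object plays the role of the `void *arg` pointer in the C signature.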

C++ API

You can use the following code to play with supertonic-3 with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;

  config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
  config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
  config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
  config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
  config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
  config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
  config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";

  config.model.num_threads = 2;

  // If you don't want to see debug messages, please set it to 0
  config.model.debug = 1;

  std::string filename = "./test.wav";
  std::string text = "Dies ist eine Text-to-Speech-Engine, die Kaldi der nächsten Generation verwendet";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.num_steps = 8;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed
  gen_cfg.extra = R"({"lang": "de"})";

  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

To build and run the example, first build sherpa-onnx as a shared library with the C API enabled:

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
 -DSHERPA_ONNX_ENABLE_C_API=ON \
 -DCMAKE_BUILD_TYPE=Release \
 -DBUILD_SHARED_LIBS=ON \
 -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
 ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-supertonic.cc. Then you can compile it with the following command:

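# Libraries are listed after the source file so that linkers which
# resolve symbols from left to right can find them.
g++ \
  -std=c++17 \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-supertonic \
  /tmp/test-supertonic.cc \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-cxx-api \
  -lsherpa-onnx-c-api \
  -lonnxruntime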

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-supertonic.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with supertonic-3 with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsSupertonicModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            supertonic: OfflineTtsSupertonicModelConfig {
                duration_predictor: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx".into()),
                text_encoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx".into()),
                vector_estimator: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx".into()),
                vocoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx".into()),
                tts_json: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json".into()),
                unicode_indexer: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin".into()),
                voice_style: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Dies ist eine Text-to-Speech-Engine, die Kaldi der nächsten Generation verwendet";

    let gen_config = GenerationConfig {
        sid: 0,
        num_steps: 8,
        speed: 1.0,
        extra: Some(r#"{"lang": "de"}"#.into()),
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with supertonic-3 with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      supertonic: {
        durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
        textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
        vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
        vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
        ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
        unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
        voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
      },
      debug: true,
      numThreads: 2,
      provider: 'cpu',
    },
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Dies ist eine Text-to-Speech-Engine, die Kaldi der nächsten Generation verwendet';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  numSteps: 8,
  speed: 1.0,
  extra: {lang: 'de'},
});

const start = Date.now();
const audio = tts.generate({text, generationConfig});
const stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.
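
The Node.js example above reports a real-time factor (RTF), computed as the elapsed processing time divided by the duration of the generated audio. The same arithmetic is sketched below in Python; `real_time_factor` is a hypothetical helper and the numbers are made up for illustration, not measurements.

```python
def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """RTF = processing time / audio duration.

    Values below 1.0 mean generation ran faster than real time.
    """
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate
    return elapsed_seconds / duration


if __name__ == "__main__":
    # Made-up numbers: 1.5 s of compute for 72000 samples at 24000 Hz.
    print(f"RTF = {real_time_factor(1.5, 72000, 24000):.3f}")
```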

Dart API

You can use the following code to play with supertonic-3 with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final supertonic = sherpa_onnx.OfflineTtsSupertonicModelConfig(
    durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
    textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
    vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
    vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
    ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
    unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
    voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    supertonic: supertonic,
    numThreads: 2,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    numSteps: 8,
    speed: 1.0,
    extra: {'lang': 'de'},
  );
  final audio = tts.generateWithConfig(text: 'Dies ist eine Text-to-Speech-Engine, die Kaldi der nächsten Generation verwendet', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with supertonic-3 with Swift API.

func run() {
  let supertonic = sherpaOnnxOfflineTtsSupertonicModelConfig(
    durationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
    textEncoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
    vectorEstimator: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
    vocoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
    ttsJson: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
    unicodeIndexer: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
    voiceStyle: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(supertonic: supertonic)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Dies ist eine Text-to-Speech-Engine, die Kaldi der nächsten Generation verwendet"
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.numSteps = 8
  genConfig.speed = 1.0
  genConfig.extra = ["lang": "de"]

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with supertonic-3 with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Supertonic.DurationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.Model.Supertonic.TextEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.Model.Supertonic.VectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.Model.Supertonic.Vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.Model.Supertonic.TtsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.Model.Supertonic.UnicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.Model.Supertonic.VoiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";

config.Model.NumThreads = 2;
config.Model.Debug = 1;
config.Model.Provider = "cpu";

var tts = new OfflineTts(config);
var text = "Dies ist eine Text-to-Speech-Engine, die Kaldi der nächsten Generation verwendet";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.NumSteps = 8;
genConfig.Speed = 1.0f;
genConfig.Extra = "{\"lang\": \"de\"}";

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with supertonic-3 with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  val config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      supertonic = OfflineTtsSupertonicModelConfig(
        durationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
        textEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
        vectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
        vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
        ttsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
        unicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
        voiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
      ),
      numThreads = 2,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    numSteps = 8,
    speed = 1.0f,
    extra = mapOf("lang" to "de"),
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Dies ist eine Text-to-Speech-Engine, die Kaldi der nächsten Generation verwendet",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with supertonic-3 with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var supertonic = new OfflineTtsSupertonicModelConfig();
    supertonic.setDurationPredictor("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx");
    supertonic.setTextEncoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx");
    supertonic.setVectorEstimator("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx");
    supertonic.setVocoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx");
    supertonic.setTtsJson("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json");
    supertonic.setUnicodeIndexer("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin");
    supertonic.setVoiceStyle("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setSupertonic(supertonic);
    modelConfig.setNumThreads(2);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);

    var tts = new OfflineTts(config);
    var text = "Dies ist eine Text-to-Speech-Engine, die Kaldi der nächsten Generation verwendet";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setNumSteps(8);
    genConfig.setSpeed(1.0f);
    genConfig.setExtra("{\"lang\": \"de\"}");

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with supertonic-3 with Pascal API.

program test_supertonic;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Supertonic.DurationPredictor := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx';
  Config.Model.Supertonic.TextEncoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx';
  Config.Model.Supertonic.VectorEstimator := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx';
  Config.Model.Supertonic.Vocoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx';
  Config.Model.Supertonic.TtsJson := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json';
  Config.Model.Supertonic.UnicodeIndexer := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin';
  Config.Model.Supertonic.VoiceStyle := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin';

  Config.Model.NumThreads := 2;
  Config.Model.Debug := True;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.NumSteps := 8;
  GenConfig.Speed := 1.0;
  GenConfig.Extra := '{"lang": "de"}';

  Audio := Tts.GenerateWithConfig('Dies ist eine Text-to-Speech-Engine, die Kaldi der nächsten Generation verwendet', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with supertonic-3 with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Supertonic: sherpa.OfflineTtsSupertonicModelConfig{
				DurationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
				TextEncoder:       "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
				VectorEstimator:   "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
				Vocoder:           "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
				TtsJson:           "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
				UnicodeIndexer:    "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
				VoiceStyle:        "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
			},
			NumThreads: 2,
			Debug:      true,
		},
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Dies ist eine Text-to-Speech-Engine, die Kaldi der nächsten Generation verwendet"

	genConfig := sherpa.GenerationConfig{
		Sid:       0,
		NumSteps:  8,
		Speed:     1.0,
		Extra:     `{"lang": "de"}`,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

Sample audios for different speakers are listed below:

Speaker 0

0

Hallo Welt.

1

Wie geht es dir heute?

2

Der Himmel ist blau und der Wind ist mild.

3

Maschinelles Lernen hilft Computern, aus Daten zu lernen.

4

Sprachsynthese wandelt Text in klare Sprache um.

5

Die Schüler lasen am Morgen eine kurze Geschichte.

6

Der Zug hatte wegen Wartungsarbeiten Verspätung.

7

Kleine Modelle laufen schnell auf lokalen Geräten.

8

Ein Sprachassistent hilft bei alltäglichen Aufgaben.

9

Stabiles Vorlesen ist für kurze und lange Texte wichtig.

Speaker 1

0

Hallo Welt.

1

Wie geht es dir heute?

2

Der Himmel ist blau und der Wind ist mild.

3

Maschinelles Lernen hilft Computern, aus Daten zu lernen.

4

Sprachsynthese wandelt Text in klare Sprache um.

5

Die Schüler lasen am Morgen eine kurze Geschichte.

6

Der Zug hatte wegen Wartungsarbeiten Verspätung.

7

Kleine Modelle laufen schnell auf lokalen Geräten.

8

Ein Sprachassistent hilft bei alltäglichen Aufgaben.

9

Stabiles Vorlesen ist für kurze und lange Texte wichtig.

Speaker 2

0

Hallo Welt.

1

Wie geht es dir heute?

2

Der Himmel ist blau und der Wind ist mild.

3

Maschinelles Lernen hilft Computern, aus Daten zu lernen.

4

Sprachsynthese wandelt Text in klare Sprache um.

5

Die Schüler lasen am Morgen eine kurze Geschichte.

6

Der Zug hatte wegen Wartungsarbeiten Verspätung.

7

Kleine Modelle laufen schnell auf lokalen Geräten.

8

Ein Sprachassistent hilft bei alltäglichen Aufgaben.

9

Stabiles Vorlesen ist für kurze und lange Texte wichtig.

Speaker 3

0

Hallo Welt.

1

Wie geht es dir heute?

2

Der Himmel ist blau und der Wind ist mild.

3

Maschinelles Lernen hilft Computern, aus Daten zu lernen.

4

Sprachsynthese wandelt Text in klare Sprache um.

5

Die Schüler lasen am Morgen eine kurze Geschichte.

6

Der Zug hatte wegen Wartungsarbeiten Verspätung.

7

Kleine Modelle laufen schnell auf lokalen Geräten.

8

Ein Sprachassistent hilft bei alltäglichen Aufgaben.

9

Stabiles Vorlesen ist für kurze und lange Texte wichtig.

Speaker 4

0

Hallo Welt.

1

Wie geht es dir heute?

2

Der Himmel ist blau und der Wind ist mild.

3

Maschinelles Lernen hilft Computern, aus Daten zu lernen.

4

Sprachsynthese wandelt Text in klare Sprache um.

5

Die Schüler lasen am Morgen eine kurze Geschichte.

6

Der Zug hatte wegen Wartungsarbeiten Verspätung.

7

Kleine Modelle laufen schnell auf lokalen Geräten.

8

Ein Sprachassistent hilft bei alltäglichen Aufgaben.

9

Stabiles Vorlesen ist für kurze und lange Texte wichtig.

Speaker 5

0

Hallo Welt.

1

Wie geht es dir heute?

2

Der Himmel ist blau und der Wind ist mild.

3

Maschinelles Lernen hilft Computern, aus Daten zu lernen.

4

Sprachsynthese wandelt Text in klare Sprache um.

5

Die Schüler lasen am Morgen eine kurze Geschichte.

6

Der Zug hatte wegen Wartungsarbeiten Verspätung.

7

Kleine Modelle laufen schnell auf lokalen Geräten.

8

Ein Sprachassistent hilft bei alltäglichen Aufgaben.

9

Stabiles Vorlesen ist für kurze und lange Texte wichtig.

Speaker 6

0

Hallo Welt.

1

Wie geht es dir heute?

2

Der Himmel ist blau und der Wind ist mild.

3

Maschinelles Lernen hilft Computern, aus Daten zu lernen.

4

Sprachsynthese wandelt Text in klare Sprache um.

5

Die Schüler lasen am Morgen eine kurze Geschichte.

6

Der Zug hatte wegen Wartungsarbeiten Verspätung.

7

Kleine Modelle laufen schnell auf lokalen Geräten.

8

Ein Sprachassistent hilft bei alltäglichen Aufgaben.

9

Stabiles Vorlesen ist für kurze und lange Texte wichtig.

Speaker 7

0

Hallo Welt.

1

Wie geht es dir heute?

2

Der Himmel ist blau und der Wind ist mild.

3

Maschinelles Lernen hilft Computern, aus Daten zu lernen.

4

Sprachsynthese wandelt Text in klare Sprache um.

5

Die Schüler lasen am Morgen eine kurze Geschichte.

6

Der Zug hatte wegen Wartungsarbeiten Verspätung.

7

Kleine Modelle laufen schnell auf lokalen Geräten.

8

Ein Sprachassistent hilft bei alltäglichen Aufgaben.

9

Stabiles Vorlesen ist für kurze und lange Texte wichtig.

Speaker 8

0

Hallo Welt.

1

Wie geht es dir heute?

2

Der Himmel ist blau und der Wind ist mild.

3

Maschinelles Lernen hilft Computern, aus Daten zu lernen.

4

Sprachsynthese wandelt Text in klare Sprache um.

5

Die Schüler lasen am Morgen eine kurze Geschichte.

6

Der Zug hatte wegen Wartungsarbeiten Verspätung.

7

Kleine Modelle laufen schnell auf lokalen Geräten.

8

Ein Sprachassistent hilft bei alltäglichen Aufgaben.

9

Stabiles Vorlesen ist für kurze und lange Texte wichtig.

Speaker 9

0

Hallo Welt.

1

Wie geht es dir heute?

2

Der Himmel ist blau und der Wind ist mild.

3

Maschinelles Lernen hilft Computern, aus Daten zu lernen.

4

Sprachsynthese wandelt Text in klare Sprache um.

5

Die Schüler lasen am Morgen eine kurze Geschichte.

6

Der Zug hatte wegen Wartungsarbeiten Verspätung.

7

Kleine Modelle laufen schnell auf lokalen Geräten.

8

Ein Sprachassistent hilft bei alltäglichen Aufgaben.

9

Stabiles Vorlesen ist für kurze und lange Texte wichtig.

Greek

This section lists text-to-speech models for Greek.

vits-piper-el_GR-rapunzelina-low

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/el/el_GR/rapunzelina/low

Number of speakers    Sample rate
1                     16000

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-el_GR-rapunzelina-low.tar.bz2
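
The download can also be scripted. The following is a minimal sketch using only the Python standard library; it mirrors the download, extract, and remove steps used elsewhere in this document. The helper names (`archive_name`, `extract_archive`, `download_and_extract`) are hypothetical, and the code assumes the archive contains a single top-level directory.

```python
import tarfile
import urllib.request
from pathlib import Path

MODEL_URL = (
    "https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/"
    "vits-piper-el_GR-rapunzelina-low.tar.bz2"
)


def archive_name(url: str) -> str:
    """Return the file name part of a download URL."""
    return url.rsplit("/", 1)[-1]


def extract_archive(archive: Path, dest: Path) -> Path:
    """Extract a .tar.bz2 archive into dest and return its top-level directory.

    Assumption: the first member of the archive lives under the single
    top-level directory that holds the model files.
    """
    with tarfile.open(archive, "r:bz2") as tar:
        top = tar.getnames()[0].removeprefix("./").split("/", 1)[0]
        tar.extractall(dest)
    return dest / top


def download_and_extract(url: str = MODEL_URL, dest: Path = Path(".")) -> Path:
    """Download the archive, extract it, and delete the archive afterwards."""
    archive = dest / archive_name(url)
    urllib.request.urlretrieve(url, archive)
    model_dir = extract_archive(archive, dest)
    archive.unlink()  # same effect as removing the .tar.bz2 by hand
    return model_dir
```

Calling `download_and_extract()` in the directory where you plan to run the examples should leave a `vits-piper-el_GR-rapunzelina-low/` directory there, which is the relative path the code samples on this page expect.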

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI            URL         China mirror
arm64-v8a      Download    Download
armeabi-v7a    Download    Download
x86_64         Download    Download
x86            Download    Download

If you don’t know what ABI is, you probably need to select arm64-v8a.
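
If you would rather check than guess, the ABI can be derived from the CPU architecture string. The sketch below maps common machine names to the four ABI names in the table above; the function name `android_abi` is hypothetical, and the fallback to arm64-v8a simply follows the advice in the previous sentence.

```python
import platform

# Common CPU architecture names and the Android ABI they correspond to.
_ABI_BY_MACHINE = {
    "aarch64": "arm64-v8a",
    "arm64": "arm64-v8a",
    "armv7l": "armeabi-v7a",
    "x86_64": "x86_64",
    "amd64": "x86_64",
    "i686": "x86",
    "i386": "x86",
}


def android_abi(machine: str, default: str = "arm64-v8a") -> str:
    """Return the Android ABI name for a CPU architecture string.

    Unknown architectures fall back to `default`.
    """
    return _ABI_BY_MACHINE.get(machine.lower(), default)


# Example: when run on the device itself (for instance in a terminal app),
# platform.machine() reports the architecture of that device.
# print(android_abi(platform.machine()))
```

With a device connected over adb, `adb shell getprop ro.product.cpu.abi` reports the ABI directly, so the mapping is only needed when you have nothing but the architecture name.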

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-el_GR-rapunzelina-low with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-el_GR-rapunzelina-low/el_GR-rapunzelina-low.onnx";
  config.model.vits.tokens = "vits-piper-el_GR-rapunzelina-low/tokens.txt";
  config.model.vits.data_dir = "vits-piper-el_GR-rapunzelina-low/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

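  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
  if (!tts) {
    // Creation can fail, e.g., when a model path is wrong
    fprintf(stderr, "Failed to create the TTS engine. Please check your config\n");
    return -1;
  }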

  const char *text = "Όταν το δέντρο είναι μικρό, το στρέβλεις· όταν είναι μεγάλο, το λυγίζεις.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
 -DSHERPA_ONNX_ENABLE_C_API=ON \
 -DCMAKE_BUILD_TYPE=Release \
 -DBUILD_SHARED_LIBS=ON \
 -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
 ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc \
  -I /tmp/sherpa-onnx/shared/include \
  -L /tmp/sherpa-onnx/shared/lib \
  -o /tmp/test-piper \
  /tmp/test-piper.c \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-el_GR-rapunzelina-low.tar.bz2

You can use the following code to play with vits-piper-el_GR-rapunzelina-low

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-el_GR-rapunzelina-low/el_GR-rapunzelina-low.onnx",
            data_dir="vits-piper-el_GR-rapunzelina-low/espeak-ng-data",
            tokens="vits-piper-el_GR-rapunzelina-low/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Όταν το δέντρο είναι μικρό, το στρέβλεις· όταν είναι μεγάλο, το λυγίζεις.",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
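
If soundfile is not available, the samples can be saved with the standard library instead. The following is a minimal sketch, assuming `audio.samples` is a sequence of mono float samples in the range [-1, 1]; the helper name `write_wave_pcm16` is hypothetical and not part of sherpa-onnx.

```python
import struct
import wave


def write_wave_pcm16(filename, samples, sample_rate):
    """Write mono float samples in [-1, 1] as a 16-bit PCM WAV file."""
    frames = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))  # clip out-of-range values
        frames += struct.pack("<h", int(round(s * 32767)))

    with wave.open(filename, "wb") as f:
        f.setnchannels(1)  # mono
        f.setsampwidth(2)  # 2 bytes per sample = 16-bit
        f.setframerate(sample_rate)
        f.writeframes(bytes(frames))
```

It would be called as `write_wave_pcm16("test.wav", audio.samples, audio.sample_rate)` in place of the `sf.write` line.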

C++ API

You can use the following code to play with vits-piper-el_GR-rapunzelina-low with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-el_GR-rapunzelina-low/el_GR-rapunzelina-low.onnx";
  config.model.vits.tokens = "vits-piper-el_GR-rapunzelina-low/tokens.txt";
  config.model.vits.data_dir = "vits-piper-el_GR-rapunzelina-low/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Όταν το δέντρο είναι μικρό, το στρέβλεις· όταν είναι μεγάλο, το λυγίζεις.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
 -DSHERPA_ONNX_ENABLE_C_API=ON \
 -DCMAKE_BUILD_TYPE=Release \
 -DBUILD_SHARED_LIBS=ON \
 -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
 ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++ \
  -std=c++17 \
  -I /tmp/sherpa-onnx/shared/include \
  -L /tmp/sherpa-onnx/shared/lib \
  -o /tmp/test-piper \
  /tmp/test-piper.cc \
  -lsherpa-onnx-cxx-api \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with vits-piper-el_GR-rapunzelina-low with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-el_GR-rapunzelina-low/el_GR-rapunzelina-low.onnx".into()),
                tokens: Some("vits-piper-el_GR-rapunzelina-low/tokens.txt".into()),
                data_dir: Some("vits-piper-el_GR-rapunzelina-low/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Όταν το δέντρο είναι μικρό, το στρέβλεις· όταν είναι μεγάλο, το λυγίζεις.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-el_GR-rapunzelina-low with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-el_GR-rapunzelina-low/el_GR-rapunzelina-low.onnx',
        tokens: 'vits-piper-el_GR-rapunzelina-low/tokens.txt',
        dataDir: 'vits-piper-el_GR-rapunzelina-low/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Όταν το δέντρο είναι μικρό, το στρέβλεις· όταν είναι μεγάλο, το λυγίζεις.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.
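
The Node.js script above reports a real-time factor (RTF): the time spent generating divided by the duration of the generated audio, so values below 1 mean synthesis runs faster than playback. The same arithmetic is sketched in Python below; the helper names `real_time_factor` and `timed` are hypothetical.

```python
import time


def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """Return elapsed time divided by the duration of the generated audio."""
    duration = num_samples / sample_rate  # audio length in seconds
    if duration <= 0:
        raise ValueError("The generated audio is empty")
    return elapsed_seconds / duration


def timed(fn, *args, **kwargs):
    """Call fn and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start
```

For example, 0.5 seconds spent generating 16000 samples at 16000 Hz (one second of audio) gives an RTF of 0.5.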

Dart API

You can use the following code to play with vits-piper-el_GR-rapunzelina-low with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-el_GR-rapunzelina-low/el_GR-rapunzelina-low.onnx',
    tokens: 'vits-piper-el_GR-rapunzelina-low/tokens.txt',
    dataDir: 'vits-piper-el_GR-rapunzelina-low/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Όταν το δέντρο είναι μικρό, το στρέβλεις· όταν είναι μεγάλο, το λυγίζεις.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-el_GR-rapunzelina-low with Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-el_GR-rapunzelina-low/el_GR-rapunzelina-low.onnx",
    lexicon: "",
    tokens: "vits-piper-el_GR-rapunzelina-low/tokens.txt",
    dataDir: "vits-piper-el_GR-rapunzelina-low/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Όταν το δέντρο είναι μικρό, το στρέβλεις· όταν είναι μεγάλο, το λυγίζεις."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-el_GR-rapunzelina-low with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-el_GR-rapunzelina-low/el_GR-rapunzelina-low.onnx";
config.Model.Vits.Tokens = "vits-piper-el_GR-rapunzelina-low/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-el_GR-rapunzelina-low/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Όταν το δέντρο είναι μικρό, το στρέβλεις· όταν είναι μεγάλο, το λυγίζεις.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-el_GR-rapunzelina-low with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-el_GR-rapunzelina-low/el_GR-rapunzelina-low.onnx",
        tokens = "vits-piper-el_GR-rapunzelina-low/tokens.txt",
        dataDir = "vits-piper-el_GR-rapunzelina-low/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Όταν το δέντρο είναι μικρό, το στρέβλεις· όταν είναι μεγάλο, το λυγίζεις.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with vits-piper-el_GR-rapunzelina-low with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-el_GR-rapunzelina-low/el_GR-rapunzelina-low.onnx");
    vits.setTokens("vits-piper-el_GR-rapunzelina-low/tokens.txt");
    vits.setDataDir("vits-piper-el_GR-rapunzelina-low/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Όταν το δέντρο είναι μικρό, το στρέβλεις· όταν είναι μεγάλο, το λυγίζεις.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-el_GR-rapunzelina-low with Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-el_GR-rapunzelina-low/el_GR-rapunzelina-low.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-el_GR-rapunzelina-low/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-el_GR-rapunzelina-low/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Όταν το δέντρο είναι μικρό, το στρέβλεις· όταν είναι μεγάλο, το λυγίζεις.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-el_GR-rapunzelina-low with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-el_GR-rapunzelina-low/el_GR-rapunzelina-low.onnx",
				Tokens:  "vits-piper-el_GR-rapunzelina-low/tokens.txt",
				DataDir: "vits-piper-el_GR-rapunzelina-low/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Όταν το δέντρο είναι μικρό, το στρέβλεις· όταν είναι μεγάλο, το λυγίζεις."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Όταν το δέντρο είναι μικρό, το στρέβλεις· όταν είναι μεγάλο, το λυγίζεις.

sample audios for different speakers are listed below:

Speaker 0

supertonic-3-el

Info about this model

This model is supertonic 3 from https://huggingface.co/Supertone/supertonic-3

It supports 31 languages: en, ko, ja, ar, bg, cs, da, de, el, es, et, fi, fr, hi, hr, hu, id, it, lt, lv, nl, pl, pt, ro, ru, sk, sl, sv, tr, uk, vi.

This page shows samples for Greek (el).

Number of speakers    Sample rate
10                    24000

Speaker IDs

sid    0  1  2  3  4  5  6  7  8  9
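
The API examples further down this page select the speaker through `sid` and the language through an `extra` field, which the C API takes as a small JSON string such as `{"lang": "el"}`. The sketch below validates both values against the lists above before they are handed to the engine. The helper names `check_sid` and `make_extra` are hypothetical and not part of sherpa-onnx.

```python
import json

# The 31 language codes listed above for this model.
SUPPORTED_LANGS = (
    "en", "ko", "ja", "ar", "bg", "cs", "da", "de", "el", "es", "et",
    "fi", "fr", "hi", "hr", "hu", "id", "it", "lt", "lv", "nl", "pl",
    "pt", "ro", "ru", "sk", "sl", "sv", "tr", "uk", "vi",
)

# Speaker IDs run from 0 to 9.
NUM_SPEAKERS = 10


def check_sid(sid: int) -> int:
    """Return sid unchanged if it is a valid speaker ID, else raise."""
    if not 0 <= sid < NUM_SPEAKERS:
        raise ValueError(f"sid must be in [0, {NUM_SPEAKERS - 1}], got {sid}")
    return sid


def make_extra(lang: str) -> str:
    """Return the JSON string that selects a language, e.g. for Greek."""
    if lang not in SUPPORTED_LANGS:
        raise ValueError(f"Unsupported language code: {lang}")
    return json.dumps({"lang": lang})
```

`make_extra("el")` produces the same string the C example assigns to `gen_cfg.extra`.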

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-supertonic-3-tts-int8-2026-05-11.tar.bz2

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI            URL         China mirror
arm64-v8a      Download    Download
armeabi-v7a    Download    Download
x86_64         Download    Download
x86            Download    Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-supertonic-3-tts-int8-2026-05-11.tar.bz2

You can use the following code to play with supertonic-3

import sherpa_onnx
import soundfile as sf

tts_config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        supertonic=sherpa_onnx.OfflineTtsSupertonicModelConfig(
            duration_predictor="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
            text_encoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
            vector_estimator="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
            vocoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
            tts_json="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
            unicode_indexer="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
            voice_style="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
        ),
        debug=False,
        num_threads=2,
        provider="cpu",
    ),
)

tts = sherpa_onnx.OfflineTts(tts_config)

gen_config = sherpa_onnx.GenerationConfig()
gen_config.sid = 0
gen_config.num_steps = 8
gen_config.speed = 1.0
gen_config.extra["lang"] = "el"

audio = tts.generate("Αυτή είναι μια μηχανή κειμένου σε ομιλία που χρησιμοποιεί kaldi επόμενης γενιάς", gen_config)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
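
The config above repeats the model directory seven times, and a wrong path is an easy mistake to make. The following sketch builds the seven keyword arguments from one directory and checks that each file exists before the engine is created. The helper name `supertonic_kwargs` is hypothetical; the file names and keyword names are the ones used in the example above.

```python
from pathlib import Path

# Keyword argument name -> file name inside the extracted model directory.
SUPERTONIC_FILES = {
    "duration_predictor": "duration_predictor.int8.onnx",
    "text_encoder": "text_encoder.int8.onnx",
    "vector_estimator": "vector_estimator.int8.onnx",
    "vocoder": "vocoder.int8.onnx",
    "tts_json": "tts.json",
    "unicode_indexer": "unicode_indexer.bin",
    "voice_style": "voice.bin",
}


def supertonic_kwargs(model_dir):
    """Return the model file paths as keyword arguments.

    Raises FileNotFoundError listing every file that is missing.
    """
    root = Path(model_dir)
    paths = {key: root / name for key, name in SUPERTONIC_FILES.items()}
    missing = [str(p) for p in paths.values() if not p.is_file()]
    if missing:
        raise FileNotFoundError("Missing model files: " + ", ".join(missing))
    return {key: str(p) for key, p in paths.items()}
```

The result can then be unpacked into the model config, for example `sherpa_onnx.OfflineTtsSupertonicModelConfig(**supertonic_kwargs("sherpa-onnx-supertonic-3-tts-int8-2026-05-11"))`.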

C API

You can use the following code to play with supertonic-3 with C API.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));
  config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
  config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
  config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
  config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
  config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
  config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
  config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";

  config.model.num_threads = 2;

  // If you don't want to see debug messages, please set it to 0
  config.model.debug = 1;

  const char *text = "Αυτή είναι μια μηχανή κειμένου σε ομιλία που χρησιμοποιεί kaldi επόμενης γενιάς";

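  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
  if (!tts) {
    // Creation can fail, e.g., when a model path is wrong
    fprintf(stderr, "Failed to create the TTS engine. Please check your config\n");
    return -1;
  }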

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.num_steps = 8;
  gen_cfg.speed = 1.0;
  gen_cfg.extra = "{\"lang\": \"el\"}";

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg,
                                             ProgressCallback, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
 -DSHERPA_ONNX_ENABLE_C_API=ON \
 -DCMAKE_BUILD_TYPE=Release \
 -DBUILD_SHARED_LIBS=ON \
 -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
 ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-supertonic.c. Then you can compile it with the following command:

gcc \
  -I /tmp/sherpa-onnx/shared/include \
  -L /tmp/sherpa-onnx/shared/lib \
  -o /tmp/test-supertonic \
  /tmp/test-supertonic.c \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-supertonic.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

C++ API

You can use the following code to play with supertonic-3 with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;

  config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
  config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
  config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
  config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
  config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
  config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
  config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";

  config.model.num_threads = 2;

  // If you don't want to see debug messages, please set it to 0
  config.model.debug = 1;

  std::string filename = "./test.wav";
  std::string text = "Αυτή είναι μια μηχανή κειμένου σε ομιλία που χρησιμοποιεί kaldi επόμενης γενιάς";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.num_steps = 8;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed
  gen_cfg.extra = R"({"lang": "el"})";

  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
 -DSHERPA_ONNX_ENABLE_C_API=ON \
 -DCMAKE_BUILD_TYPE=Release \
 -DBUILD_SHARED_LIBS=ON \
 -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
 ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-supertonic.cc. Then you can compile it with the following command:

g++ \
  -std=c++17 \
  -I /tmp/sherpa-onnx/shared/include \
  -L /tmp/sherpa-onnx/shared/lib \
  -o /tmp/test-supertonic \
  /tmp/test-supertonic.cc \
  -lsherpa-onnx-cxx-api \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-supertonic.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with supertonic-3 with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsSupertonicModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            supertonic: OfflineTtsSupertonicModelConfig {
                duration_predictor: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx".into()),
                text_encoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx".into()),
                vector_estimator: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx".into()),
                vocoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx".into()),
                tts_json: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json".into()),
                unicode_indexer: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin".into()),
                voice_style: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Αυτή είναι μια μηχανή κειμένου σε ομιλία που χρησιμοποιεί kaldi επόμενης γενιάς";

    let gen_config = GenerationConfig {
        sid: 0,
        num_steps: 8,
        speed: 1.0,
        extra: Some(r#"{"lang": "el"}"#.into()),
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.
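
The Rust example above passes per-request options through `extra` as a JSON string (`{"lang": "el"}`). When the language code comes from a variable, building that string with a JSON serializer avoids quoting mistakes. The sketch below uses Python's standard `json` module; `make_extra` is a hypothetical helper name that is not part of sherpa-onnx, and only the `lang` key shown in this document is assumed.

```python
import json


def make_extra(lang: str) -> str:
    """Return a JSON string such as {"lang": "el"} for the `extra` field.

    `make_extra` is a hypothetical helper used for illustration only.
    """
    if not lang:
        raise ValueError("lang must be a non-empty language code")
    # ensure_ascii=False keeps any non-ASCII values readable in the output.
    return json.dumps({"lang": lang}, ensure_ascii=False)


extra = make_extra("el")
print(extra)  # the same string that the examples in this section hard-code
```

The resulting string can be handed to any binding that expects `extra` as text, while bindings that accept a map or dictionary can take the options directly.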

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with supertonic-3 with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      supertonic: {
        durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
        textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
        vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
        vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
        ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
        unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
        voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
      },
      debug: true,
      numThreads: 2,
      provider: 'cpu',
    },
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Αυτή είναι μια μηχανή κειμένου σε ομιλία που χρησιμοποιεί kaldi επόμενης γενιάς';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  numSteps: 8,
  speed: 1.0,
  extra: {lang: 'el'},
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.
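
The Node.js example above prints a real-time factor (RTF): the time spent generating divided by the duration of the generated audio. A value below 1 means synthesis runs faster than playback. The following is a minimal sketch of the same arithmetic in Python; the function name and the numbers are made up for illustration and are not measurements of this model.

```python
def real_time_factor(elapsed_seconds: float, num_samples: int, sample_rate: int) -> float:
    """Return elapsed time divided by audio duration (RTF)."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    # Audio duration in seconds is the number of samples over the sample rate.
    duration = num_samples / sample_rate
    return elapsed_seconds / duration


# Hypothetical numbers: 66150 samples at 22050 Hz (3 s of audio) generated in 1.5 s.
rtf = real_time_factor(1.5, 66150, 22050)
print(f"RTF = {rtf:.3f}")
```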

Dart API

You can use the following code to play with supertonic-3 with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final supertonic = sherpa_onnx.OfflineTtsSupertonicModelConfig(
    durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
    textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
    vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
    vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
    ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
    unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
    voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    supertonic: supertonic,
    numThreads: 2,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    numSteps: 8,
    speed: 1.0,
    extra: {'lang': 'el'},
  );
  final audio = tts.generateWithConfig(text: 'Αυτή είναι μια μηχανή κειμένου σε ομιλία που χρησιμοποιεί kaldi επόμενης γενιάς', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with supertonic-3 with Swift API.

func run() {
  let supertonic = sherpaOnnxOfflineTtsSupertonicModelConfig(
    durationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
    textEncoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
    vectorEstimator: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
    vocoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
    ttsJson: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
    unicodeIndexer: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
    voiceStyle: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(supertonic: supertonic)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Αυτή είναι μια μηχανή κειμένου σε ομιλία που χρησιμοποιεί kaldi επόμενης γενιάς"
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.numSteps = 8
  genConfig.speed = 1.0
  genConfig.extra = ["lang": "el"]

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with supertonic-3 with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Supertonic.DurationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.Model.Supertonic.TextEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.Model.Supertonic.VectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.Model.Supertonic.Vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.Model.Supertonic.TtsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.Model.Supertonic.UnicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.Model.Supertonic.VoiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";

config.Model.NumThreads = 2;
config.Model.Debug = 1;
config.Model.Provider = "cpu";

var tts = new OfflineTts(config);
var text = "Αυτή είναι μια μηχανή κειμένου σε ομιλία που χρησιμοποιεί kaldi επόμενης γενιάς";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.NumSteps = 8;
genConfig.Speed = 1.0f;
genConfig.Extra = "{\"lang\": \"el\"}";

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with supertonic-3 with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      supertonic = OfflineTtsSupertonicModelConfig(
        durationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
        textEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
        vectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
        vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
        ttsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
        unicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
        voiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
      ),
      numThreads = 2,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    numSteps = 8,
    speed = 1.0f,
    extra = mapOf("lang" to "el"),
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Αυτή είναι μια μηχανή κειμένου σε ομιλία που χρησιμοποιεί kaldi επόμενης γενιάς",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.
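
The callback in the Kotlin example above returns 1 to continue and 0 to stop. To illustrate how that convention can be used to cut generation short, here is a small self-contained Python sketch. It does not call sherpa-onnx: `StopAfter` and `fake_generate` are hypothetical names, and `fake_generate` merely stands in for an engine that delivers audio in chunks and honors the return value.

```python
from typing import Callable, List, Sequence


class StopAfter:
    """Callback that requests a stop once enough samples have arrived.

    Hypothetical helper: it only mirrors the 1 (continue) / 0 (stop)
    return convention described in the example above.
    """

    def __init__(self, max_samples: int) -> None:
        self.max_samples = max_samples
        self.total = 0

    def __call__(self, samples: Sequence[float]) -> int:
        self.total += len(samples)
        # Keep going while we are still below the requested budget.
        return 1 if self.total < self.max_samples else 0


def fake_generate(chunks: List[List[float]], callback: Callable[[Sequence[float]], int]) -> int:
    """Stand-in for a TTS engine: feed chunks until the callback returns 0."""
    delivered = 0
    for chunk in chunks:
        delivered += 1
        if callback(chunk) == 0:
            break
    return delivered


cb = StopAfter(max_samples=5)
n = fake_generate([[0.0] * 3, [0.0] * 3, [0.0] * 3], cb)
print(n, cb.total)  # the third chunk is never delivered
```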

Java API

You can use the following code to play with supertonic-3 with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var supertonic = new OfflineTtsSupertonicModelConfig();
    supertonic.setDurationPredictor("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx");
    supertonic.setTextEncoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx");
    supertonic.setVectorEstimator("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx");
    supertonic.setVocoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx");
    supertonic.setTtsJson("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json");
    supertonic.setUnicodeIndexer("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin");
    supertonic.setVoiceStyle("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setSupertonic(supertonic);
    modelConfig.setNumThreads(2);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);

    var tts = new OfflineTts(config);
    var text = "Αυτή είναι μια μηχανή κειμένου σε ομιλία που χρησιμοποιεί kaldi επόμενης γενιάς";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setNumSteps(8);
    genConfig.setSpeed(1.0f);
    genConfig.setExtra("{\"lang\": \"el\"}");

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with supertonic-3 with Pascal API.

program test_supertonic;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Supertonic.DurationPredictor := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx';
  Config.Model.Supertonic.TextEncoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx';
  Config.Model.Supertonic.VectorEstimator := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx';
  Config.Model.Supertonic.Vocoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx';
  Config.Model.Supertonic.TtsJson := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json';
  Config.Model.Supertonic.UnicodeIndexer := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin';
  Config.Model.Supertonic.VoiceStyle := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin';

  Config.Model.NumThreads := 2;
  Config.Model.Debug := True;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.NumSteps := 8;
  GenConfig.Speed := 1.0;
  GenConfig.Extra := '{"lang": "el"}';

  Audio := Tts.GenerateWithConfig('Αυτή είναι μια μηχανή κειμένου σε ομιλία που χρησιμοποιεί kaldi επόμενης γενιάς', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with supertonic-3 with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Supertonic: sherpa.OfflineTtsSupertonicModelConfig{
				DurationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
				TextEncoder:       "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
				VectorEstimator:   "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
				Vocoder:           "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
				TtsJson:           "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
				UnicodeIndexer:    "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
				VoiceStyle:        "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
			},
			NumThreads: 2,
			Debug:      true,
		},
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Αυτή είναι μια μηχανή κειμένου σε ομιλία που χρησιμοποιεί kaldi επόμενης γενιάς"

	genConfig := sherpa.GenerationConfig{
		Sid:       0,
		NumSteps:  8,
		Speed:     1.0,
		Extra:     `{"lang": "el"}`,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

Sample audio clips for different speakers are listed below:

Speaker 0

0

Γεια σου κόσμε.

1

Πώς είσαι σήμερα;

2

Ο ουρανός είναι γαλάζιος και ο άνεμος ήρεμος.

3

Η μηχανική μάθηση βοηθά τους υπολογιστές να μαθαίνουν από δεδομένα.

4

Η σύνθεση ομιλίας μετατρέπει το κείμενο σε καθαρό ήχο.

5

Οι μαθητές διάβασαν μια μικρή ιστορία στη βιβλιοθήκη.

6

Το τρένο καθυστέρησε λόγω εργασιών συντήρησης.

7

Τα μικρά μοντέλα λειτουργούν γρήγορα σε τοπικές συσκευές.

8

Ο φωνητικός βοηθός διευκολύνει τις καθημερινές εργασίες.

9

Η σταθερή ανάγνωση είναι σημαντική για σύντομα και μεγάλα κείμενα.

Speaker 1

0

Γεια σου κόσμε.

1

Πώς είσαι σήμερα;

2

Ο ουρανός είναι γαλάζιος και ο άνεμος ήρεμος.

3

Η μηχανική μάθηση βοηθά τους υπολογιστές να μαθαίνουν από δεδομένα.

4

Η σύνθεση ομιλίας μετατρέπει το κείμενο σε καθαρό ήχο.

5

Οι μαθητές διάβασαν μια μικρή ιστορία στη βιβλιοθήκη.

6

Το τρένο καθυστέρησε λόγω εργασιών συντήρησης.

7

Τα μικρά μοντέλα λειτουργούν γρήγορα σε τοπικές συσκευές.

8

Ο φωνητικός βοηθός διευκολύνει τις καθημερινές εργασίες.

9

Η σταθερή ανάγνωση είναι σημαντική για σύντομα και μεγάλα κείμενα.

Speaker 2

0

Γεια σου κόσμε.

1

Πώς είσαι σήμερα;

2

Ο ουρανός είναι γαλάζιος και ο άνεμος ήρεμος.

3

Η μηχανική μάθηση βοηθά τους υπολογιστές να μαθαίνουν από δεδομένα.

4

Η σύνθεση ομιλίας μετατρέπει το κείμενο σε καθαρό ήχο.

5

Οι μαθητές διάβασαν μια μικρή ιστορία στη βιβλιοθήκη.

6

Το τρένο καθυστέρησε λόγω εργασιών συντήρησης.

7

Τα μικρά μοντέλα λειτουργούν γρήγορα σε τοπικές συσκευές.

8

Ο φωνητικός βοηθός διευκολύνει τις καθημερινές εργασίες.

9

Η σταθερή ανάγνωση είναι σημαντική για σύντομα και μεγάλα κείμενα.

Speaker 3

0

Γεια σου κόσμε.

1

Πώς είσαι σήμερα;

2

Ο ουρανός είναι γαλάζιος και ο άνεμος ήρεμος.

3

Η μηχανική μάθηση βοηθά τους υπολογιστές να μαθαίνουν από δεδομένα.

4

Η σύνθεση ομιλίας μετατρέπει το κείμενο σε καθαρό ήχο.

5

Οι μαθητές διάβασαν μια μικρή ιστορία στη βιβλιοθήκη.

6

Το τρένο καθυστέρησε λόγω εργασιών συντήρησης.

7

Τα μικρά μοντέλα λειτουργούν γρήγορα σε τοπικές συσκευές.

8

Ο φωνητικός βοηθός διευκολύνει τις καθημερινές εργασίες.

9

Η σταθερή ανάγνωση είναι σημαντική για σύντομα και μεγάλα κείμενα.

Speaker 4

0

Γεια σου κόσμε.

1

Πώς είσαι σήμερα;

2

Ο ουρανός είναι γαλάζιος και ο άνεμος ήρεμος.

3

Η μηχανική μάθηση βοηθά τους υπολογιστές να μαθαίνουν από δεδομένα.

4

Η σύνθεση ομιλίας μετατρέπει το κείμενο σε καθαρό ήχο.

5

Οι μαθητές διάβασαν μια μικρή ιστορία στη βιβλιοθήκη.

6

Το τρένο καθυστέρησε λόγω εργασιών συντήρησης.

7

Τα μικρά μοντέλα λειτουργούν γρήγορα σε τοπικές συσκευές.

8

Ο φωνητικός βοηθός διευκολύνει τις καθημερινές εργασίες.

9

Η σταθερή ανάγνωση είναι σημαντική για σύντομα και μεγάλα κείμενα.

Speaker 5

0

Γεια σου κόσμε.

1

Πώς είσαι σήμερα;

2

Ο ουρανός είναι γαλάζιος και ο άνεμος ήρεμος.

3

Η μηχανική μάθηση βοηθά τους υπολογιστές να μαθαίνουν από δεδομένα.

4

Η σύνθεση ομιλίας μετατρέπει το κείμενο σε καθαρό ήχο.

5

Οι μαθητές διάβασαν μια μικρή ιστορία στη βιβλιοθήκη.

6

Το τρένο καθυστέρησε λόγω εργασιών συντήρησης.

7

Τα μικρά μοντέλα λειτουργούν γρήγορα σε τοπικές συσκευές.

8

Ο φωνητικός βοηθός διευκολύνει τις καθημερινές εργασίες.

9

Η σταθερή ανάγνωση είναι σημαντική για σύντομα και μεγάλα κείμενα.

Speaker 6

0

Γεια σου κόσμε.

1

Πώς είσαι σήμερα;

2

Ο ουρανός είναι γαλάζιος και ο άνεμος ήρεμος.

3

Η μηχανική μάθηση βοηθά τους υπολογιστές να μαθαίνουν από δεδομένα.

4

Η σύνθεση ομιλίας μετατρέπει το κείμενο σε καθαρό ήχο.

5

Οι μαθητές διάβασαν μια μικρή ιστορία στη βιβλιοθήκη.

6

Το τρένο καθυστέρησε λόγω εργασιών συντήρησης.

7

Τα μικρά μοντέλα λειτουργούν γρήγορα σε τοπικές συσκευές.

8

Ο φωνητικός βοηθός διευκολύνει τις καθημερινές εργασίες.

9

Η σταθερή ανάγνωση είναι σημαντική για σύντομα και μεγάλα κείμενα.

Speaker 7

0

Γεια σου κόσμε.

1

Πώς είσαι σήμερα;

2

Ο ουρανός είναι γαλάζιος και ο άνεμος ήρεμος.

3

Η μηχανική μάθηση βοηθά τους υπολογιστές να μαθαίνουν από δεδομένα.

4

Η σύνθεση ομιλίας μετατρέπει το κείμενο σε καθαρό ήχο.

5

Οι μαθητές διάβασαν μια μικρή ιστορία στη βιβλιοθήκη.

6

Το τρένο καθυστέρησε λόγω εργασιών συντήρησης.

7

Τα μικρά μοντέλα λειτουργούν γρήγορα σε τοπικές συσκευές.

8

Ο φωνητικός βοηθός διευκολύνει τις καθημερινές εργασίες.

9

Η σταθερή ανάγνωση είναι σημαντική για σύντομα και μεγάλα κείμενα.

Speaker 8

0

Γεια σου κόσμε.

1

Πώς είσαι σήμερα;

2

Ο ουρανός είναι γαλάζιος και ο άνεμος ήρεμος.

3

Η μηχανική μάθηση βοηθά τους υπολογιστές να μαθαίνουν από δεδομένα.

4

Η σύνθεση ομιλίας μετατρέπει το κείμενο σε καθαρό ήχο.

5

Οι μαθητές διάβασαν μια μικρή ιστορία στη βιβλιοθήκη.

6

Το τρένο καθυστέρησε λόγω εργασιών συντήρησης.

7

Τα μικρά μοντέλα λειτουργούν γρήγορα σε τοπικές συσκευές.

8

Ο φωνητικός βοηθός διευκολύνει τις καθημερινές εργασίες.

9

Η σταθερή ανάγνωση είναι σημαντική για σύντομα και μεγάλα κείμενα.

Speaker 9

0

Γεια σου κόσμε.

1

Πώς είσαι σήμερα;

2

Ο ουρανός είναι γαλάζιος και ο άνεμος ήρεμος.

3

Η μηχανική μάθηση βοηθά τους υπολογιστές να μαθαίνουν από δεδομένα.

4

Η σύνθεση ομιλίας μετατρέπει το κείμενο σε καθαρό ήχο.

5

Οι μαθητές διάβασαν μια μικρή ιστορία στη βιβλιοθήκη.

6

Το τρένο καθυστέρησε λόγω εργασιών συντήρησης.

7

Τα μικρά μοντέλα λειτουργούν γρήγορα σε τοπικές συσκευές.

8

Ο φωνητικός βοηθός διευκολύνει τις καθημερινές εργασίες.

9

Η σταθερή ανάγνωση είναι σημαντική για σύντομα και μεγάλα κείμενα.

Hindi

This section lists text to speech models for Hindi.

vits-piper-hi_IN-pratham-medium

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/hi/hi_IN/pratham/medium

Number of speakers | Sample rate
1 | 22050

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-hi_IN-pratham-medium.tar.bz2
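
One way to fetch and unpack the archive at the address above is sketched below, using only the Python standard library. The helper names `download` and `extract` are hypothetical, and the sketch assumes the archive unpacks into a vits-piper-hi_IN-pratham-medium directory, as the paths in the examples in this section suggest.

```python
import tarfile
import urllib.request
from pathlib import Path

MODEL_URL = (
    "https://github.com/k2-fsa/sherpa-onnx/releases/download/"
    "tts-models/vits-piper-hi_IN-pratham-medium.tar.bz2"
)


def download(url: str, dest: Path) -> Path:
    """Download url to dest, skipping the download if dest already exists."""
    if not dest.exists():
        urllib.request.urlretrieve(url, str(dest))
    return dest


def extract(archive: Path, out_dir: Path) -> list:
    """Extract a .tar.bz2 archive into out_dir and return the member names."""
    out_dir.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive, "r:bz2") as tar:
        names = tar.getnames()
        if hasattr(tarfile, "data_filter"):
            # Newer Python versions: refuse unsafe paths inside the archive.
            tar.extractall(out_dir, filter="data")
        else:
            tar.extractall(out_dir)
    return names


# Typical use (needs network access, so it is left commented out):
# archive = download(MODEL_URL, Path("vits-piper-hi_IN-pratham-medium.tar.bz2"))
# extract(archive, Path("."))
```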

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI | URL | China mirror
arm64-v8a | Download | Download
armeabi-v7a | Download | Download
x86_64 | Download | Download
x86 | Download | Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-hi_IN-pratham-medium with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-hi_IN-pratham-medium/hi_IN-pratham-medium.onnx";
  config.model.vits.tokens = "vits-piper-hi_IN-pratham-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-hi_IN-pratham-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "यह मत पूछो कि तुम्हारा देश तुम्हारे लिए क्या कर सकता है। यह पूछो कि तुम अपने देश के लिए क्या कर सकते हो।";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
 -DSHERPA_ONNX_ENABLE_C_API=ON \
 -DCMAKE_BUILD_TYPE=Release \
 -DBUILD_SHARED_LIBS=ON \
 -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
 ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-piper \
  /tmp/test-piper.c \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

If the program cannot find the shared libraries at startup, you probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-hi_IN-pratham-medium.tar.bz2

You can use the following code to play with vits-piper-hi_IN-pratham-medium with Python API.

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-hi_IN-pratham-medium/hi_IN-pratham-medium.onnx",
            data_dir="vits-piper-hi_IN-pratham-medium/espeak-ng-data",
            tokens="vits-piper-hi_IN-pratham-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="यह मत पूछो कि तुम्हारा देश तुम्हारे लिए क्या कर सकता है। यह पूछो कि तुम अपने देश के लिए क्या कर सकते हो।",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
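
The example above relies on the soundfile package to save the result. If you prefer to avoid that dependency, the following sketch writes 16-bit mono PCM with the standard-library `wave` module. It assumes the samples are floats in the range [-1, 1]; `write_wav_16bit` is a hypothetical helper, not part of sherpa-onnx.

```python
import struct
import wave


def write_wav_16bit(filename: str, samples, sample_rate: int) -> int:
    """Write mono float samples in [-1, 1] as 16-bit PCM; return frames written."""
    frames = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))  # clip to avoid integer overflow
        frames += struct.pack("<h", int(s * 32767))
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)  # mono
        f.setsampwidth(2)  # 2 bytes = 16 bits per sample
        f.setframerate(sample_rate)
        f.writeframes(bytes(frames))
    return len(frames) // 2


# Typical use with the objects from the example above (left commented out):
# write_wav_16bit("test.wav", audio.samples, audio.sample_rate)
```

Writing sample by sample keeps the sketch dependency-free; for long audio, a vectorized conversion would be faster.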

C++ API

You can use the following code to play with vits-piper-hi_IN-pratham-medium with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-hi_IN-pratham-medium/hi_IN-pratham-medium.onnx";
  config.model.vits.tokens = "vits-piper-hi_IN-pratham-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-hi_IN-pratham-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "यह मत पूछो कि तुम्हारा देश तुम्हारे लिए क्या कर सकता है। यह पूछो कि तुम अपने देश के लिए क्या कर सकते हो।";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
 -DSHERPA_ONNX_ENABLE_C_API=ON \
 -DCMAKE_BUILD_TYPE=Release \
 -DBUILD_SHARED_LIBS=ON \
 -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
 ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++ \
  -std=c++17 \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-piper \
  /tmp/test-piper.cc \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-cxx-api \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

If the program cannot find the shared libraries at startup, you probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with vits-piper-hi_IN-pratham-medium with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-hi_IN-pratham-medium/hi_IN-pratham-medium.onnx".into()),
                tokens: Some("vits-piper-hi_IN-pratham-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-hi_IN-pratham-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "यह मत पूछो कि तुम्हारा देश तुम्हारे लिए क्या कर सकता है। यह पूछो कि तुम अपने देश के लिए क्या कर सकते हो।";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-hi_IN-pratham-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-hi_IN-pratham-medium/hi_IN-pratham-medium.onnx',
        tokens: 'vits-piper-hi_IN-pratham-medium/tokens.txt',
        dataDir: 'vits-piper-hi_IN-pratham-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'यह मत पूछो कि तुम्हारा देश तुम्हारे लिए क्या कर सकता है। यह पूछो कि तुम अपने देश के लिए क्या कर सकते हो।';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to play with vits-piper-hi_IN-pratham-medium with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-hi_IN-pratham-medium/hi_IN-pratham-medium.onnx',
    tokens: 'vits-piper-hi_IN-pratham-medium/tokens.txt',
    dataDir: 'vits-piper-hi_IN-pratham-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'यह मत पूछो कि तुम्हारा देश तुम्हारे लिए क्या कर सकता है। यह पूछो कि तुम अपने देश के लिए क्या कर सकते हो।', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to try vits-piper-hi_IN-pratham-medium with the Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-hi_IN-pratham-medium/hi_IN-pratham-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-hi_IN-pratham-medium/tokens.txt",
    dataDir: "vits-piper-hi_IN-pratham-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "यह मत पूछो कि तुम्हारा देश तुम्हारे लिए क्या कर सकता है। यह पूछो कि तुम अपने देश के लिए क्या कर सकते हो।"
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to try vits-piper-hi_IN-pratham-medium with the C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-hi_IN-pratham-medium/hi_IN-pratham-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-hi_IN-pratham-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-hi_IN-pratham-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "यह मत पूछो कि तुम्हारा देश तुम्हारे लिए क्या कर सकता है। यह पूछो कि तुम अपने देश के लिए क्या कर सकते हो।";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to try vits-piper-hi_IN-pratham-medium with the Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-hi_IN-pratham-medium/hi_IN-pratham-medium.onnx",
        tokens = "vits-piper-hi_IN-pratham-medium/tokens.txt",
        dataDir = "vits-piper-hi_IN-pratham-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "यह मत पूछो कि तुम्हारा देश तुम्हारे लिए क्या कर सकता है। यह पूछो कि तुम अपने देश के लिए क्या कर सकते हो।",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to try vits-piper-hi_IN-pratham-medium with the Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-hi_IN-pratham-medium/hi_IN-pratham-medium.onnx");
    vits.setTokens("vits-piper-hi_IN-pratham-medium/tokens.txt");
    vits.setDataDir("vits-piper-hi_IN-pratham-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "यह मत पूछो कि तुम्हारा देश तुम्हारे लिए क्या कर सकता है। यह पूछो कि तुम अपने देश के लिए क्या कर सकते हो।";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.
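
The Kotlin and Java examples pass a callback that returns 1 to continue and 0 to stop. The following Python sketch simulates that contract with a stand-in driver. `run_with_callback` and `make_limit_callback` are hypothetical helpers, not sherpa-onnx functions, and the choice to keep the chunk on which the callback returns 0 is an assumption of this sketch.

```python
from typing import Callable, List, Sequence


def run_with_callback(chunks: Sequence[List[float]],
                      callback: Callable[[List[float]], int]) -> List[float]:
    """Hypothetical driver: pass each chunk to the callback and stop once it returns 0."""
    collected: List[float] = []
    for chunk in chunks:
        collected.extend(chunk)
        if callback(chunk) == 0:  # 0 means stop, 1 means continue
            break
    return collected


def make_limit_callback(max_samples: int) -> Callable[[List[float]], int]:
    """Build a callback that asks to stop once max_samples samples have been seen."""
    seen = 0

    def callback(samples: List[float]) -> int:
        nonlocal seen
        seen += len(samples)
        return 1 if seen < max_samples else 0

    return callback


# Three fake chunks of four samples each; stop after eight samples.
chunks = [[0.0] * 4, [0.1] * 4, [0.2] * 4]
audio = run_with_callback(chunks, make_limit_callback(8))
print(len(audio))
```

A callback that always returns 1, as in the examples above, lets generation run to the end of the text.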

Pascal API

You can use the following code to try vits-piper-hi_IN-pratham-medium with the Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-hi_IN-pratham-medium/hi_IN-pratham-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-hi_IN-pratham-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-hi_IN-pratham-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('यह मत पूछो कि तुम्हारा देश तुम्हारे लिए क्या कर सकता है। यह पूछो कि तुम अपने देश के लिए क्या कर सकते हो।', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to try vits-piper-hi_IN-pratham-medium with the Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-hi_IN-pratham-medium/hi_IN-pratham-medium.onnx",
				Tokens:  "vits-piper-hi_IN-pratham-medium/tokens.txt",
				DataDir: "vits-piper-hi_IN-pratham-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "यह मत पूछो कि तुम्हारा देश तुम्हारे लिए क्या कर सकता है। यह पूछो कि तुम अपने देश के लिए क्या कर सकते हो।"

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

यह मत पूछो कि तुम्हारा देश तुम्हारे लिए क्या कर सकता है। यह पूछो कि तुम अपने देश के लिए क्या कर सकते हो।

audio samples for each speaker are listed below:

Speaker 0

vits-piper-hi_IN-priyamvada-medium

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/hi/hi_IN/priyamvada/medium

Number of speakers | Sample rate (Hz)
1 | 22050

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-hi_IN-priyamvada-medium.tar.bz2
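
If you prefer to fetch the archive from a script, the following Python sketch uses only the standard library. The helper names are made up for this illustration. It assumes that the archive unpacks into a directory named after the model, which is what the paths in the examples below suggest.

```python
import tarfile
import urllib.request
from pathlib import Path

BASE_URL = "https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models"
SUFFIX = ".tar.bz2"


def model_url(name: str) -> str:
    """Build the release URL for a model archive."""
    return f"{BASE_URL}/{name}{SUFFIX}"


def extract_model(archive: Path, dest: Path) -> Path:
    """Unpack a .tar.bz2 archive into dest and return the assumed model directory."""
    with tarfile.open(archive, "r:bz2") as tar:
        tar.extractall(dest)
    # Assumption: the top-level directory has the archive name without the suffix.
    return dest / archive.name[: -len(SUFFIX)]


def download_model(name: str, dest: Path = Path(".")) -> Path:
    """Download the archive if it is not already present, then unpack it."""
    archive = dest / f"{name}{SUFFIX}"
    if not archive.exists():
        urllib.request.urlretrieve(model_url(name), archive)
    return extract_model(archive, dest)


# Example call (performs a network download, so it is left commented out):
# download_model("vits-piper-hi_IN-priyamvada-medium")
```

Unlike the shell commands used elsewhere in this document, this sketch keeps the downloaded archive on disk, so a second call skips the download.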

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI | URL | China mirror
arm64-v8a | Download | Download
armeabi-v7a | Download | Download
x86_64 | Download | Download
x86 | Download | Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to try vits-piper-hi_IN-priyamvada-medium with the C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-hi_IN-priyamvada-medium/hi_IN-priyamvada-medium.onnx";
  config.model.vits.tokens = "vits-piper-hi_IN-priyamvada-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-hi_IN-priyamvada-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "यह मत पूछो कि तुम्हारा देश तुम्हारे लिए क्या कर सकता है। यह पूछो कि तुम अपने देश के लिए क्या कर सकते हो।";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
  -DSHERPA_ONNX_ENABLE_C_API=ON \
  -DCMAKE_BUILD_TYPE=Release \
  -DBUILD_SHARED_LIBS=ON \
  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
  ..

make
make install

You can find the required header files and libraries inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-piper \
  /tmp/test-piper.c \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.
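
If you launch the compiled binary from a script instead of a shell, the same library search path can be set for the child process only. The Python sketch below builds such an environment. `library_path_env` is a hypothetical helper, and the sketch covers only the Linux and macOS variables shown above.

```python
import os
import sys
from typing import Dict, Optional


def library_path_env(lib_dir: str,
                     platform: str = sys.platform,
                     base_env: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Return a copy of the environment with lib_dir prepended to the loader path."""
    env = dict(os.environ if base_env is None else base_env)
    # macOS uses DYLD_LIBRARY_PATH; Linux uses LD_LIBRARY_PATH.
    var = "DYLD_LIBRARY_PATH" if platform == "darwin" else "LD_LIBRARY_PATH"
    existing = env.get(var)
    env[var] = f"{lib_dir}:{existing}" if existing else lib_dir
    return env


env = library_path_env("/tmp/sherpa-onnx/shared/lib", platform="linux", base_env={})
print(env)
```

The returned dictionary can be passed to `subprocess.run(["/tmp/test-piper"], env=..., cwd="/tmp")`, which leaves the environment of the calling process unchanged.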

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-hi_IN-priyamvada-medium.tar.bz2

You can use the following code to try vits-piper-hi_IN-priyamvada-medium with the Python API:

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-hi_IN-priyamvada-medium/hi_IN-priyamvada-medium.onnx",
            data_dir="vits-piper-hi_IN-priyamvada-medium/espeak-ng-data",
            tokens="vits-piper-hi_IN-priyamvada-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="यह मत पूछो कि तुम्हारा देश तुम्हारे लिए क्या कर सकता है। यह पूछो कि तुम अपने देश के लिए क्या कर सकते हो।",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
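
The example above relies on the third-party soundfile package to save the result. If you would rather avoid that dependency, the sketch below writes 16-bit PCM with Python's standard `wave` module. `write_wave_pcm16` is a name made up for this illustration, and the sketch assumes mono float samples in the range [-1, 1].

```python
import struct
import wave
from typing import Iterable


def write_wave_pcm16(filename: str, samples: Iterable[float], sample_rate: int) -> int:
    """Write mono float samples as 16-bit PCM and return the number of frames written."""
    frames = bytearray()
    for s in samples:
        clipped = max(-1.0, min(1.0, float(s)))  # guard against out-of-range values
        frames += struct.pack("<h", int(round(clipped * 32767)))
    with wave.open(filename, "wb") as wf:
        wf.setnchannels(1)        # mono
        wf.setsampwidth(2)        # 2 bytes per sample = 16-bit
        wf.setframerate(sample_rate)
        wf.writeframes(bytes(frames))
    return len(frames) // 2
```

With the objects from the example above, a call would look like `write_wave_pcm16("test.wav", audio.samples, audio.sample_rate)`.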

C++ API

You can use the following code to try vits-piper-hi_IN-priyamvada-medium with the C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-hi_IN-priyamvada-medium/hi_IN-priyamvada-medium.onnx";
  config.model.vits.tokens = "vits-piper-hi_IN-priyamvada-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-hi_IN-priyamvada-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "यह मत पूछो कि तुम्हारा देश तुम्हारे लिए क्या कर सकता है। यह पूछो कि तुम अपने देश के लिए क्या कर सकते हो।";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
  -DSHERPA_ONNX_ENABLE_C_API=ON \
  -DCMAKE_BUILD_TYPE=Release \
  -DBUILD_SHARED_LIBS=ON \
  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
  ..

make
make install

You can find the required header files and libraries inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++ \
  -std=c++17 \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-piper \
  /tmp/test-piper.cc \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-cxx-api \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to try vits-piper-hi_IN-priyamvada-medium with the Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-hi_IN-priyamvada-medium/hi_IN-priyamvada-medium.onnx".into()),
                tokens: Some("vits-piper-hi_IN-priyamvada-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-hi_IN-priyamvada-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "यह मत पूछो कि तुम्हारा देश तुम्हारे लिए क्या कर सकता है। यह पूछो कि तुम अपने देश के लिए क्या कर सकते हो।";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.
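
The Rust example prints the number of speakers before generating audio, and every example in this section uses speaker ID 0 for this single-speaker model. A small guard such as the Python sketch below can catch an out-of-range ID early. `check_speaker_id` is a hypothetical helper, and it assumes that valid IDs run from 0 to the number of speakers minus one.

```python
def check_speaker_id(sid: int, num_speakers: int) -> int:
    """Return sid unchanged if it lies in [0, num_speakers); raise ValueError otherwise."""
    if num_speakers < 1:
        raise ValueError("num_speakers must be at least 1")
    if not 0 <= sid < num_speakers:
        raise ValueError(
            f"sid {sid} is out of range; valid IDs are 0..{num_speakers - 1}")
    return sid


# For a single-speaker model, only speaker ID 0 passes the check.
print(check_speaker_id(0, 1))
```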

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-hi_IN-priyamvada-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-hi_IN-priyamvada-medium/hi_IN-priyamvada-medium.onnx',
        tokens: 'vits-piper-hi_IN-priyamvada-medium/tokens.txt',
        dataDir: 'vits-piper-hi_IN-priyamvada-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'यह मत पूछो कि तुम्हारा देश तुम्हारे लिए क्या कर सकता है। यह पूछो कि तुम अपने देश के लिए क्या कर सकते हो।';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to try vits-piper-hi_IN-priyamvada-medium with the Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-hi_IN-priyamvada-medium/hi_IN-priyamvada-medium.onnx',
    tokens: 'vits-piper-hi_IN-priyamvada-medium/tokens.txt',
    dataDir: 'vits-piper-hi_IN-priyamvada-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'यह मत पूछो कि तुम्हारा देश तुम्हारे लिए क्या कर सकता है। यह पूछो कि तुम अपने देश के लिए क्या कर सकते हो।', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to try vits-piper-hi_IN-priyamvada-medium with the Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-hi_IN-priyamvada-medium/hi_IN-priyamvada-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-hi_IN-priyamvada-medium/tokens.txt",
    dataDir: "vits-piper-hi_IN-priyamvada-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "यह मत पूछो कि तुम्हारा देश तुम्हारे लिए क्या कर सकता है। यह पूछो कि तुम अपने देश के लिए क्या कर सकते हो।"
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to try vits-piper-hi_IN-priyamvada-medium with the C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-hi_IN-priyamvada-medium/hi_IN-priyamvada-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-hi_IN-priyamvada-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-hi_IN-priyamvada-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "यह मत पूछो कि तुम्हारा देश तुम्हारे लिए क्या कर सकता है। यह पूछो कि तुम अपने देश के लिए क्या कर सकते हो।";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to try vits-piper-hi_IN-priyamvada-medium with the Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-hi_IN-priyamvada-medium/hi_IN-priyamvada-medium.onnx",
        tokens = "vits-piper-hi_IN-priyamvada-medium/tokens.txt",
        dataDir = "vits-piper-hi_IN-priyamvada-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "यह मत पूछो कि तुम्हारा देश तुम्हारे लिए क्या कर सकता है। यह पूछो कि तुम अपने देश के लिए क्या कर सकते हो।",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to try vits-piper-hi_IN-priyamvada-medium with the Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-hi_IN-priyamvada-medium/hi_IN-priyamvada-medium.onnx");
    vits.setTokens("vits-piper-hi_IN-priyamvada-medium/tokens.txt");
    vits.setDataDir("vits-piper-hi_IN-priyamvada-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "यह मत पूछो कि तुम्हारा देश तुम्हारे लिए क्या कर सकता है। यह पूछो कि तुम अपने देश के लिए क्या कर सकते हो।";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to try vits-piper-hi_IN-priyamvada-medium with the Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-hi_IN-priyamvada-medium/hi_IN-priyamvada-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-hi_IN-priyamvada-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-hi_IN-priyamvada-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('यह मत पूछो कि तुम्हारा देश तुम्हारे लिए क्या कर सकता है। यह पूछो कि तुम अपने देश के लिए क्या कर सकते हो।', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to try vits-piper-hi_IN-priyamvada-medium with the Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-hi_IN-priyamvada-medium/hi_IN-priyamvada-medium.onnx",
				Tokens:  "vits-piper-hi_IN-priyamvada-medium/tokens.txt",
				DataDir: "vits-piper-hi_IN-priyamvada-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "यह मत पूछो कि तुम्हारा देश तुम्हारे लिए क्या कर सकता है। यह पूछो कि तुम अपने देश के लिए क्या कर सकते हो।"

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

यह मत पूछो कि तुम्हारा देश तुम्हारे लिए क्या कर सकता है। यह पूछो कि तुम अपने देश के लिए क्या कर सकते हो।

audio samples for each speaker are listed below:

Speaker 0

vits-piper-hi_IN-rohan-medium

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/hi/hi_IN/rohan/medium

Number of speakers | Sample rate (Hz)
1 | 22050

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-hi_IN-rohan-medium.tar.bz2

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI | URL | China mirror
arm64-v8a | Download | Download
armeabi-v7a | Download | Download
x86_64 | Download | Download
x86 | Download | Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to try vits-piper-hi_IN-rohan-medium with the C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-hi_IN-rohan-medium/hi_IN-rohan-medium.onnx";
  config.model.vits.tokens = "vits-piper-hi_IN-rohan-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-hi_IN-rohan-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "यह मत पूछो कि तुम्हारा देश तुम्हारे लिए क्या कर सकता है। यह पूछो कि तुम अपने देश के लिए क्या कर सकते हो।";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
  -DSHERPA_ONNX_ENABLE_C_API=ON \
  -DCMAKE_BUILD_TYPE=Release \
  -DBUILD_SHARED_LIBS=ON \
  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
  ..

make
make install

You can find the required header files and libraries inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-piper \
  /tmp/test-piper.c \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-hi_IN-rohan-medium.tar.bz2

You can use the following code to play with vits-piper-hi_IN-rohan-medium

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-hi_IN-rohan-medium/hi_IN-rohan-medium.onnx",
            data_dir="vits-piper-hi_IN-rohan-medium/espeak-ng-data",
            tokens="vits-piper-hi_IN-rohan-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="यह मत पूछो कि तुम्हारा देश तुम्हारे लिए क्या कर सकता है। यह पूछो कि तुम अपने देश के लिए क्या कर सकते हो।",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
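
If you prefer not to depend on soundfile, the generated samples can also be saved with the Python standard library. The following is a minimal sketch, not part of sherpa-onnx: the function name `write_wave` is hypothetical, and it assumes the samples are mono floating-point values normalized to the range [-1, 1], which it converts to 16-bit PCM.

```python
import struct
import wave


def write_wave(filename, samples, sample_rate):
    """Write mono float samples (assumed to lie in [-1, 1]) as 16-bit PCM WAV."""
    frames = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))  # clip out-of-range values
        frames += struct.pack("<h", int(round(s * 32767)))
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)   # mono
        f.setsampwidth(2)   # 2 bytes per sample = 16 bit
        f.setframerate(sample_rate)
        f.writeframes(bytes(frames))
```

With the `audio` object from the example above, a call would look like `write_wave("test.wav", audio.samples, audio.sample_rate)`.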

C++ API

You can use the following code to play with vits-piper-hi_IN-rohan-medium with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-hi_IN-rohan-medium/hi_IN-rohan-medium.onnx";
  config.model.vits.tokens = "vits-piper-hi_IN-rohan-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-hi_IN-rohan-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "यह मत पूछो कि तुम्हारा देश तुम्हारे लिए क्या कर सकता है। यह पूछो कि तुम अपने देश के लिए क्या कर सकते हो।";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++   -std=c++17   -I /tmp/sherpa-onnx/shared/include   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-cxx-api   -lsherpa-onnx-c-api   -lonnxruntime   -o /tmp/test-piper   /tmp/test-piper.cc

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

If the shared libraries cannot be found at run time, you probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with vits-piper-hi_IN-rohan-medium with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-hi_IN-rohan-medium/hi_IN-rohan-medium.onnx".into()),
                tokens: Some("vits-piper-hi_IN-rohan-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-hi_IN-rohan-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "यह मत पूछो कि तुम्हारा देश तुम्हारे लिए क्या कर सकता है। यह पूछो कि तुम अपने देश के लिए क्या कर सकते हो।";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-hi_IN-rohan-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-hi_IN-rohan-medium/hi_IN-rohan-medium.onnx',
        tokens: 'vits-piper-hi_IN-rohan-medium/tokens.txt',
        dataDir: 'vits-piper-hi_IN-rohan-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'यह मत पूछो कि तुम्हारा देश तुम्हारे लिए क्या कर सकता है। यह पूछो कि तुम अपने देश के लिए क्या कर सकते हो।';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.
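
The real-time factor (RTF) printed by the example above is the elapsed synthesis time divided by the duration of the generated audio, so values below 1 mean the audio is generated faster than it plays back. The same arithmetic can be sketched in Python; the function names `real_time_factor` and `timed` are hypothetical helpers, not part of any sherpa-onnx API.

```python
import time


def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """Return elapsed time divided by the duration of the generated audio."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # seconds of audio
    return elapsed_seconds / duration


def timed(fn, *args, **kwargs):
    """Call fn and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    # 48000 samples at 24000 Hz is 2 s of audio; 0.5 s of compute gives RTF 0.25.
    print(real_time_factor(0.5, 48000, 24000))
```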

Dart API

You can use the following code to play with vits-piper-hi_IN-rohan-medium with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-hi_IN-rohan-medium/hi_IN-rohan-medium.onnx',
    tokens: 'vits-piper-hi_IN-rohan-medium/tokens.txt',
    dataDir: 'vits-piper-hi_IN-rohan-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'यह मत पूछो कि तुम्हारा देश तुम्हारे लिए क्या कर सकता है। यह पूछो कि तुम अपने देश के लिए क्या कर सकते हो।', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-hi_IN-rohan-medium with Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-hi_IN-rohan-medium/hi_IN-rohan-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-hi_IN-rohan-medium/tokens.txt",
    dataDir: "vits-piper-hi_IN-rohan-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "यह मत पूछो कि तुम्हारा देश तुम्हारे लिए क्या कर सकता है। यह पूछो कि तुम अपने देश के लिए क्या कर सकते हो।"
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-hi_IN-rohan-medium with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-hi_IN-rohan-medium/hi_IN-rohan-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-hi_IN-rohan-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-hi_IN-rohan-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "यह मत पूछो कि तुम्हारा देश तुम्हारे लिए क्या कर सकता है। यह पूछो कि तुम अपने देश के लिए क्या कर सकते हो।";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-hi_IN-rohan-medium with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-hi_IN-rohan-medium/hi_IN-rohan-medium.onnx",
        tokens = "vits-piper-hi_IN-rohan-medium/tokens.txt",
        dataDir = "vits-piper-hi_IN-rohan-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "यह मत पूछो कि तुम्हारा देश तुम्हारे लिए क्या कर सकता है। यह पूछो कि तुम अपने देश के लिए क्या कर सकते हो।",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with vits-piper-hi_IN-rohan-medium with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-hi_IN-rohan-medium/hi_IN-rohan-medium.onnx");
    vits.setTokens("vits-piper-hi_IN-rohan-medium/tokens.txt");
    vits.setDataDir("vits-piper-hi_IN-rohan-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "यह मत पूछो कि तुम्हारा देश तुम्हारे लिए क्या कर सकता है। यह पूछो कि तुम अपने देश के लिए क्या कर सकते हो।";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-hi_IN-rohan-medium with Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-hi_IN-rohan-medium/hi_IN-rohan-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-hi_IN-rohan-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-hi_IN-rohan-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('यह मत पूछो कि तुम्हारा देश तुम्हारे लिए क्या कर सकता है। यह पूछो कि तुम अपने देश के लिए क्या कर सकते हो।', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-hi_IN-rohan-medium with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-hi_IN-rohan-medium/hi_IN-rohan-medium.onnx",
				Tokens:  "vits-piper-hi_IN-rohan-medium/tokens.txt",
				DataDir: "vits-piper-hi_IN-rohan-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "यह मत पूछो कि तुम्हारा देश तुम्हारे लिए क्या कर सकता है। यह पूछो कि तुम अपने देश के लिए क्या कर सकते हो।"

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

यह मत पूछो कि तुम्हारा देश तुम्हारे लिए क्या कर सकता है। यह पूछो कि तुम अपने देश के लिए क्या कर सकते हो।

sample audio for each speaker is listed below:

Speaker 0

supertonic-3-hi

Info about this model

This model is supertonic 3 from https://huggingface.co/Supertone/supertonic-3

It supports 31 languages: en, ko, ja, ar, bg, cs, da, de, el, es, et, fi, fr, hi, hr, hu, id, it, lt, lv, nl, pl, pt, ro, ru, sk, sl, sv, tr, uk, vi.

This page shows samples for Hindi (hi).
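
The C, C++, and Rust examples on this page select the language by passing a small JSON string such as `{"lang": "hi"}` in the `extra` field of the generation config. A minimal sketch of building that string in Python, restricted to the 31 codes listed above, might look like the following. The names `SUPPORTED_LANGS` and `make_extra` are hypothetical, and whether the engine itself rejects unknown codes is not stated here, so the check is only a local safeguard.

```python
import json

# The 31 language codes listed for supertonic 3 on this page.
SUPPORTED_LANGS = (
    "en", "ko", "ja", "ar", "bg", "cs", "da", "de", "el", "es",
    "et", "fi", "fr", "hi", "hr", "hu", "id", "it", "lt", "lv",
    "nl", "pl", "pt", "ro", "ru", "sk", "sl", "sv", "tr", "uk",
    "vi",
)


def make_extra(lang):
    """Return the JSON string for the `extra` field, e.g. for lang="hi"."""
    if lang not in SUPPORTED_LANGS:
        raise ValueError(f"unsupported language code: {lang!r}")
    return json.dumps({"lang": lang})


if __name__ == "__main__":
    print(make_extra("hi"))
```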

Number of speakers | Sample rate
10 | 24000

Speaker IDs

sid | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-supertonic-3-tts-int8-2026-05-11.tar.bz2
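
Downloading and unpacking the archive can be scripted with the Python standard library. The following is a minimal sketch under stated assumptions: the helper names `download` and `extract` are hypothetical, the archive is assumed to be a regular bzip2-compressed tarball, and the safer `data` extraction filter is used only when the running Python version provides it.

```python
import tarfile
import urllib.request
from pathlib import Path

MODEL_URL = (
    "https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/"
    "sherpa-onnx-supertonic-3-tts-int8-2026-05-11.tar.bz2"
)


def download(url, dest):
    """Download url to dest unless dest already exists; return dest as a Path."""
    dest = Path(dest)
    if not dest.exists():
        urllib.request.urlretrieve(url, str(dest))
    return dest


def extract(archive, out_dir="."):
    """Extract a .tar.bz2 archive into out_dir and return its top-level names."""
    with tarfile.open(archive, "r:bz2") as tar:
        names = tar.getnames()
        if hasattr(tarfile, "data_filter"):
            # Newer Python versions: refuse absolute paths, links out of tree, etc.
            tar.extractall(out_dir, filter="data")
        else:
            tar.extractall(out_dir)
    return sorted({name.split("/")[0] for name in names})
```

A typical call sequence would be `archive = download(MODEL_URL, "supertonic.tar.bz2")` followed by `extract(archive)`, where the local file name is an arbitrary choice.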

Android APK

The following table lists the Android TTS Engine APKs built with this model for sherpa-onnx v1.13.2:

ABI | URL | China mirror
arm64-v8a | Download | Download
armeabi-v7a | Download | Download
x86_64 | Download | Download
x86 | Download | Download

If you don’t know which ABI your device uses, arm64-v8a is most likely the right choice.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-supertonic-3-tts-int8-2026-05-11.tar.bz2

You can use the following code to play with supertonic-3

import sherpa_onnx
import soundfile as sf

tts_config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        supertonic=sherpa_onnx.OfflineTtsSupertonicModelConfig(
            duration_predictor="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
            text_encoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
            vector_estimator="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
            vocoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
            tts_json="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
            unicode_indexer="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
            voice_style="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
        ),
        debug=False,
        num_threads=2,
        provider="cpu",
    ),
)

tts = sherpa_onnx.OfflineTts(tts_config)

gen_config = sherpa_onnx.GenerationConfig()
gen_config.sid = 0
gen_config.num_steps = 8
gen_config.speed = 1.0
gen_config.extra["lang"] = "hi"

audio = tts.generate("यह अगली पीढ़ी के काल्डी का उपयोग करने वाला एक टेक्स्ट-टू-स्पीच इंजन है", gen_config)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
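
Because the supertonic config refers to seven separate files, a quick check that all of them exist can save a confusing failure at load time. The sketch below uses only the file names that appear in the example above; the names `SUPERTONIC_FILES` and `missing_files` are hypothetical and not part of sherpa-onnx.

```python
from pathlib import Path

# File names taken from the supertonic config shown on this page.
SUPERTONIC_FILES = (
    "duration_predictor.int8.onnx",
    "text_encoder.int8.onnx",
    "vector_estimator.int8.onnx",
    "vocoder.int8.onnx",
    "tts.json",
    "unicode_indexer.bin",
    "voice.bin",
)


def missing_files(model_dir, names=SUPERTONIC_FILES):
    """Return the expected file names that are not regular files in model_dir."""
    model_dir = Path(model_dir)
    return [name for name in names if not (model_dir / name).is_file()]


if __name__ == "__main__":
    missing = missing_files("sherpa-onnx-supertonic-3-tts-int8-2026-05-11")
    print("Missing files:", missing if missing else "none")
```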

C API

You can use the following code to play with supertonic-3 with C API.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));
  config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
  config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
  config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
  config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
  config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
  config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
  config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";

  config.model.num_threads = 2;

  // If you don't want to see debug messages, please set it to 0
  config.model.debug = 1;

  const char *text = "यह अगली पीढ़ी के काल्डी का उपयोग करने वाला एक टेक्स्ट-टू-स्पीच इंजन है";

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.num_steps = 8;
  gen_cfg.speed = 1.0;
  gen_cfg.extra = "{\"lang\": \"hi\"}";

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg,
                                             ProgressCallback, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
 -DSHERPA_ONNX_ENABLE_C_API=ON \
 -DCMAKE_BUILD_TYPE=Release \
 -DBUILD_SHARED_LIBS=ON \
 -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
 ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-supertonic.c. Then you can compile it with the following command:

gcc \
  -I /tmp/sherpa-onnx/shared/include \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-c-api \
  -lonnxruntime \
  -o /tmp/test-supertonic \
  /tmp/test-supertonic.c

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic

If the shared libraries cannot be found at run time, you probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-supertonic.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

C++ API

You can use the following code to play with supertonic-3 with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;

  config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
  config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
  config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
  config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
  config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
  config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
  config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";

  config.model.num_threads = 2;

  // If you don't want to see debug messages, please set it to 0
  config.model.debug = 1;

  std::string filename = "./test.wav";
  std::string text = "यह अगली पीढ़ी के काल्डी का उपयोग करने वाला एक टेक्स्ट-टू-स्पीच इंजन है";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.num_steps = 8;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed
  gen_cfg.extra = R"({"lang": "hi"})";

  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
 -DSHERPA_ONNX_ENABLE_C_API=ON \
 -DCMAKE_BUILD_TYPE=Release \
 -DBUILD_SHARED_LIBS=ON \
 -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
 ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-supertonic.cc. Then you can compile it with the following command:

g++ \
  -std=c++17 \
  -I /tmp/sherpa-onnx/shared/include \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-cxx-api \
  -lsherpa-onnx-c-api \
  -lonnxruntime \
  -o /tmp/test-supertonic \
  /tmp/test-supertonic.cc

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic

If the shared libraries cannot be found at run time, you probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-supertonic.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with supertonic-3 with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsSupertonicModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            supertonic: OfflineTtsSupertonicModelConfig {
                duration_predictor: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx".into()),
                text_encoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx".into()),
                vector_estimator: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx".into()),
                vocoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx".into()),
                tts_json: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json".into()),
                unicode_indexer: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin".into()),
                voice_style: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "यह अगली पीढ़ी के काल्डी का उपयोग करने वाला एक टेक्स्ट-टू-स्पीच इंजन है";

    let gen_config = GenerationConfig {
        sid: 0,
        num_steps: 8,
        speed: 1.0,
        extra: Some(r#"{"lang": "hi"}"#.into()),
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with supertonic-3 with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      supertonic: {
        durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
        textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
        vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
        vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
        ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
        unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
        voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
      },
      debug: true,
      numThreads: 2,
      provider: 'cpu',
    },
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'यह अगली पीढ़ी के काल्डी का उपयोग करने वाला एक टेक्स्ट-टू-स्पीच इंजन है';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  numSteps: 8,
  speed: 1.0,
  extra: {lang: 'hi'},
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to play with supertonic-3 with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final supertonic = sherpa_onnx.OfflineTtsSupertonicModelConfig(
    durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
    textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
    vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
    vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
    ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
    unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
    voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    supertonic: supertonic,
    numThreads: 2,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    numSteps: 8,
    speed: 1.0,
    extra: {'lang': 'hi'},
  );
  final audio = tts.generateWithConfig(text: 'यह अगली पीढ़ी के काल्डी का उपयोग करने वाला एक टेक्स्ट-टू-स्पीच इंजन है', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with supertonic-3 with Swift API.

func run() {
  let supertonic = sherpaOnnxOfflineTtsSupertonicModelConfig(
    durationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
    textEncoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
    vectorEstimator: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
    vocoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
    ttsJson: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
    unicodeIndexer: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
    voiceStyle: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(supertonic: supertonic)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "यह अगली पीढ़ी के काल्डी का उपयोग करने वाला एक टेक्स्ट-टू-स्पीच इंजन है"
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.numSteps = 8
  genConfig.speed = 1.0
  genConfig.extra = ["lang": "hi"]

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with supertonic-3 with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Supertonic.DurationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.Model.Supertonic.TextEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.Model.Supertonic.VectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.Model.Supertonic.Vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.Model.Supertonic.TtsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.Model.Supertonic.UnicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.Model.Supertonic.VoiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";

config.Model.NumThreads = 2;
config.Model.Debug = 1;
config.Model.Provider = "cpu";

var tts = new OfflineTts(config);
var text = "यह अगली पीढ़ी के काल्डी का उपयोग करने वाला एक टेक्स्ट-टू-स्पीच इंजन है";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.NumSteps = 8;
genConfig.Speed = 1.0f;
genConfig.Extra = "{\"lang\": \"hi\"}";

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.
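In the C# example above, `Extra` is passed as a JSON string with hand-escaped quotes. As a language-neutral illustration, the following Python sketch builds the same string with a JSON serializer instead of manual escaping. The helper name `build_extra` is hypothetical and is not part of sherpa-onnx; only the `lang` key is taken from the example above.

```python
import json


def build_extra(**options: str) -> str:
    """Serialize generation options to the JSON string form used for Extra.

    Hypothetical helper for illustration; not part of sherpa-onnx.
    """
    return json.dumps(options, ensure_ascii=False)


# The same value that the C# example writes by hand.
extra = build_extra(lang="hi")
print(extra)
```

Using a serializer avoids quoting mistakes when more keys are added later.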

Kotlin API

You can use the following code to play with supertonic-3 with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      supertonic = OfflineTtsSupertonicModelConfig(
        durationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
        textEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
        vectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
        vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
        ttsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
        unicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
        voiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
      ),
      numThreads = 2,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    numSteps = 8,
    speed = 1.0f,
    extra = mapOf("lang" to "hi"),
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "यह अगली पीढ़ी के काल्डी का उपयोग करने वाला एक टेक्स्ट-टू-स्पीच इंजन है",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with supertonic-3 with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var supertonic = new OfflineTtsSupertonicModelConfig();
    supertonic.setDurationPredictor("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx");
    supertonic.setTextEncoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx");
    supertonic.setVectorEstimator("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx");
    supertonic.setVocoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx");
    supertonic.setTtsJson("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json");
    supertonic.setUnicodeIndexer("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin");
    supertonic.setVoiceStyle("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setSupertonic(supertonic);
    modelConfig.setNumThreads(2);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);

    var tts = new OfflineTts(config);
    var text = "यह अगली पीढ़ी के काल्डी का उपयोग करने वाला एक टेक्स्ट-टू-स्पीच इंजन है";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setNumSteps(8);
    genConfig.setSpeed(1.0f);
    genConfig.setExtra("{\"lang\": \"hi\"}");

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with supertonic-3 with Pascal API.

program test_supertonic;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Supertonic.DurationPredictor := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx';
  Config.Model.Supertonic.TextEncoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx';
  Config.Model.Supertonic.VectorEstimator := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx';
  Config.Model.Supertonic.Vocoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx';
  Config.Model.Supertonic.TtsJson := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json';
  Config.Model.Supertonic.UnicodeIndexer := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin';
  Config.Model.Supertonic.VoiceStyle := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin';

  Config.Model.NumThreads := 2;
  Config.Model.Debug := True;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.NumSteps := 8;
  GenConfig.Speed := 1.0;
  GenConfig.Extra := '{"lang": "hi"}';

  Audio := Tts.GenerateWithConfig('यह अगली पीढ़ी के काल्डी का उपयोग करने वाला एक टेक्स्ट-टू-स्पीच इंजन है', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with supertonic-3 with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Supertonic: sherpa.OfflineTtsSupertonicModelConfig{
				DurationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
				TextEncoder:       "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
				VectorEstimator:   "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
				Vocoder:           "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
				TtsJson:           "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
				UnicodeIndexer:    "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
				VoiceStyle:        "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
			},
			NumThreads: 2,
			Debug:      true,
		},
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "यह अगली पीढ़ी के काल्डी का उपयोग करने वाला एक टेक्स्ट-टू-स्पीच इंजन है"

	genConfig := sherpa.GenerationConfig{
		Sid:       0,
		NumSteps:  8,
		Speed:     1.0,
		Extra:     `{"lang": "hi"}`,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

Sample audio for different speakers is listed below:

Speaker 0

0

नमस्ते दुनिया.

1

आज आप कैसे हैं?

2

आसमान नीला है और हवा हल्की है.

3

मशीन लर्निंग कंप्यूटरों को डेटा से सीखने में मदद करती है.

4

वाक् संश्लेषण पाठ को स्पष्ट ध्वनि में बदलता है.

5

छात्रों ने पुस्तकालय में एक छोटी कहानी पढ़ी.

6

पटरियों की मरम्मत के कारण ट्रेन थोड़ी देर से आई.

7

छोटे मॉडल स्थानीय उपकरणों पर तेज़ी से चलते हैं.

8

वॉयस असिस्टेंट रोज़मर्रा के कामों में मदद करता है.

9

लंबे और छोटे वाक्यों के लिए स्थिर पढ़ना महत्वपूर्ण है.

Speaker 1

0

नमस्ते दुनिया.

1

आज आप कैसे हैं?

2

आसमान नीला है और हवा हल्की है.

3

मशीन लर्निंग कंप्यूटरों को डेटा से सीखने में मदद करती है.

4

वाक् संश्लेषण पाठ को स्पष्ट ध्वनि में बदलता है.

5

छात्रों ने पुस्तकालय में एक छोटी कहानी पढ़ी.

6

पटरियों की मरम्मत के कारण ट्रेन थोड़ी देर से आई.

7

छोटे मॉडल स्थानीय उपकरणों पर तेज़ी से चलते हैं.

8

वॉयस असिस्टेंट रोज़मर्रा के कामों में मदद करता है.

9

लंबे और छोटे वाक्यों के लिए स्थिर पढ़ना महत्वपूर्ण है.

Speaker 2

0

नमस्ते दुनिया.

1

आज आप कैसे हैं?

2

आसमान नीला है और हवा हल्की है.

3

मशीन लर्निंग कंप्यूटरों को डेटा से सीखने में मदद करती है.

4

वाक् संश्लेषण पाठ को स्पष्ट ध्वनि में बदलता है.

5

छात्रों ने पुस्तकालय में एक छोटी कहानी पढ़ी.

6

पटरियों की मरम्मत के कारण ट्रेन थोड़ी देर से आई.

7

छोटे मॉडल स्थानीय उपकरणों पर तेज़ी से चलते हैं.

8

वॉयस असिस्टेंट रोज़मर्रा के कामों में मदद करता है.

9

लंबे और छोटे वाक्यों के लिए स्थिर पढ़ना महत्वपूर्ण है.

Speaker 3

0

नमस्ते दुनिया.

1

आज आप कैसे हैं?

2

आसमान नीला है और हवा हल्की है.

3

मशीन लर्निंग कंप्यूटरों को डेटा से सीखने में मदद करती है.

4

वाक् संश्लेषण पाठ को स्पष्ट ध्वनि में बदलता है.

5

छात्रों ने पुस्तकालय में एक छोटी कहानी पढ़ी.

6

पटरियों की मरम्मत के कारण ट्रेन थोड़ी देर से आई.

7

छोटे मॉडल स्थानीय उपकरणों पर तेज़ी से चलते हैं.

8

वॉयस असिस्टेंट रोज़मर्रा के कामों में मदद करता है.

9

लंबे और छोटे वाक्यों के लिए स्थिर पढ़ना महत्वपूर्ण है.

Speaker 4

0

नमस्ते दुनिया.

1

आज आप कैसे हैं?

2

आसमान नीला है और हवा हल्की है.

3

मशीन लर्निंग कंप्यूटरों को डेटा से सीखने में मदद करती है.

4

वाक् संश्लेषण पाठ को स्पष्ट ध्वनि में बदलता है.

5

छात्रों ने पुस्तकालय में एक छोटी कहानी पढ़ी.

6

पटरियों की मरम्मत के कारण ट्रेन थोड़ी देर से आई.

7

छोटे मॉडल स्थानीय उपकरणों पर तेज़ी से चलते हैं.

8

वॉयस असिस्टेंट रोज़मर्रा के कामों में मदद करता है.

9

लंबे और छोटे वाक्यों के लिए स्थिर पढ़ना महत्वपूर्ण है.

Speaker 5

0

नमस्ते दुनिया.

1

आज आप कैसे हैं?

2

आसमान नीला है और हवा हल्की है.

3

मशीन लर्निंग कंप्यूटरों को डेटा से सीखने में मदद करती है.

4

वाक् संश्लेषण पाठ को स्पष्ट ध्वनि में बदलता है.

5

छात्रों ने पुस्तकालय में एक छोटी कहानी पढ़ी.

6

पटरियों की मरम्मत के कारण ट्रेन थोड़ी देर से आई.

7

छोटे मॉडल स्थानीय उपकरणों पर तेज़ी से चलते हैं.

8

वॉयस असिस्टेंट रोज़मर्रा के कामों में मदद करता है.

9

लंबे और छोटे वाक्यों के लिए स्थिर पढ़ना महत्वपूर्ण है.

Speaker 6

0

नमस्ते दुनिया.

1

आज आप कैसे हैं?

2

आसमान नीला है और हवा हल्की है.

3

मशीन लर्निंग कंप्यूटरों को डेटा से सीखने में मदद करती है.

4

वाक् संश्लेषण पाठ को स्पष्ट ध्वनि में बदलता है.

5

छात्रों ने पुस्तकालय में एक छोटी कहानी पढ़ी.

6

पटरियों की मरम्मत के कारण ट्रेन थोड़ी देर से आई.

7

छोटे मॉडल स्थानीय उपकरणों पर तेज़ी से चलते हैं.

8

वॉयस असिस्टेंट रोज़मर्रा के कामों में मदद करता है.

9

लंबे और छोटे वाक्यों के लिए स्थिर पढ़ना महत्वपूर्ण है.

Speaker 7

0

नमस्ते दुनिया.

1

आज आप कैसे हैं?

2

आसमान नीला है और हवा हल्की है.

3

मशीन लर्निंग कंप्यूटरों को डेटा से सीखने में मदद करती है.

4

वाक् संश्लेषण पाठ को स्पष्ट ध्वनि में बदलता है.

5

छात्रों ने पुस्तकालय में एक छोटी कहानी पढ़ी.

6

पटरियों की मरम्मत के कारण ट्रेन थोड़ी देर से आई.

7

छोटे मॉडल स्थानीय उपकरणों पर तेज़ी से चलते हैं.

8

वॉयस असिस्टेंट रोज़मर्रा के कामों में मदद करता है.

9

लंबे और छोटे वाक्यों के लिए स्थिर पढ़ना महत्वपूर्ण है.

Speaker 8

0

नमस्ते दुनिया.

1

आज आप कैसे हैं?

2

आसमान नीला है और हवा हल्की है.

3

मशीन लर्निंग कंप्यूटरों को डेटा से सीखने में मदद करती है.

4

वाक् संश्लेषण पाठ को स्पष्ट ध्वनि में बदलता है.

5

छात्रों ने पुस्तकालय में एक छोटी कहानी पढ़ी.

6

पटरियों की मरम्मत के कारण ट्रेन थोड़ी देर से आई.

7

छोटे मॉडल स्थानीय उपकरणों पर तेज़ी से चलते हैं.

8

वॉयस असिस्टेंट रोज़मर्रा के कामों में मदद करता है.

9

लंबे और छोटे वाक्यों के लिए स्थिर पढ़ना महत्वपूर्ण है.

Speaker 9

0

नमस्ते दुनिया.

1

आज आप कैसे हैं?

2

आसमान नीला है और हवा हल्की है.

3

मशीन लर्निंग कंप्यूटरों को डेटा से सीखने में मदद करती है.

4

वाक् संश्लेषण पाठ को स्पष्ट ध्वनि में बदलता है.

5

छात्रों ने पुस्तकालय में एक छोटी कहानी पढ़ी.

6

पटरियों की मरम्मत के कारण ट्रेन थोड़ी देर से आई.

7

छोटे मॉडल स्थानीय उपकरणों पर तेज़ी से चलते हैं.

8

वॉयस असिस्टेंट रोज़मर्रा के कामों में मदद करता है.

9

लंबे और छोटे वाक्यों के लिए स्थिर पढ़ना महत्वपूर्ण है.

Hungarian

This section lists text to speech models for Hungarian.

vits-piper-hu_HU-anna-medium

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/hu/hu_HU/anna/medium

Number of speakers | Sample rate (Hz)
1 | 22050

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-hu_HU-anna-medium.tar.bz2
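
If you prefer to fetch and unpack the archive from a script, the following is a minimal sketch that uses only the Python standard library. The helper names `download` and `extract` are hypothetical; only the URL comes from this page. It assumes the archive unpacks into a `vits-piper-hu_HU-anna-medium/` directory, as the paths in the examples below suggest.

```python
import tarfile
import urllib.request
from pathlib import Path

MODEL_URL = (
    "https://github.com/k2-fsa/sherpa-onnx/releases/download/"
    "tts-models/vits-piper-hu_HU-anna-medium.tar.bz2"
)


def download(url: str, dest: Path) -> Path:
    """Download url to dest, skipping the download if dest already exists."""
    if not dest.exists():
        urllib.request.urlretrieve(url, str(dest))
    return dest


def extract(archive: Path, out_dir: Path) -> list:
    """Extract a .tar.bz2 archive into out_dir and return its member names."""
    with tarfile.open(archive, "r:bz2") as tar:
        names = tar.getnames()
        tar.extractall(out_dir)
    return names
```

A typical call would be `extract(download(MODEL_URL, Path("vits-piper-hu_HU-anna-medium.tar.bz2")), Path("."))`, after which the archive file can be deleted.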

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI | URL | China mirror
arm64-v8a | Download | Download
armeabi-v7a | Download | Download
x86_64 | Download | Download
x86 | Download | Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-hu_HU-anna-medium with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-hu_HU-anna-medium/hu_HU-anna-medium.onnx";
  config.model.vits.tokens = "vits-piper-hu_HU-anna-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-hu_HU-anna-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Ha északról fúj a szél, a lányok nem lógnak együtt.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.c -lsherpa-onnx-c-api -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html
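
To check the generated file, you can inspect its header. The sketch below uses Python's standard `wave` module and assumes that `test.wav` is an uncompressed PCM file, which is the only kind that module is guaranteed to read. The helper name `wav_info` is hypothetical. For this model, the reported sample rate should match the 22050 listed in the table above.

```python
import wave


def wav_info(path: str) -> dict:
    """Return basic header information for a PCM WAV file."""
    with wave.open(path, "rb") as f:
        frames = f.getnframes()
        rate = f.getframerate()
        return {
            "channels": f.getnchannels(),
            "sample_rate": rate,
            "sample_width_bytes": f.getsampwidth(),
            # Duration is the number of frames divided by frames per second.
            "duration_seconds": frames / rate,
        }
```

For example, `wav_info("./test.wav")` returns a dictionary whose `sample_rate` entry can be compared against the expected value.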

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-hu_HU-anna-medium.tar.bz2

You can use the following code to play with vits-piper-hu_HU-anna-medium

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-hu_HU-anna-medium/hu_HU-anna-medium.onnx",
            data_dir="vits-piper-hu_HU-anna-medium/espeak-ng-data",
            tokens="vits-piper-hu_HU-anna-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Ha északról fúj a szél, a lányok nem lógnak együtt.",
                     sid=0,
                     speed=1.0)

# Write a WAV file, matching the other examples on this page
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)

C++ API

You can use the following code to play with vits-piper-hu_HU-anna-medium with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-hu_HU-anna-medium/hu_HU-anna-medium.onnx";
  config.model.vits.tokens = "vits-piper-hu_HU-anna-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-hu_HU-anna-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Ha északról fúj a szél, a lányok nem lógnak együtt.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.cc -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html
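
The callback in the C++ example returns 1 to continue and 0 to stop. The following Python sketch illustrates that convention in isolation: it builds a callback that collects samples and asks generation to stop once a sample budget is reached. The factory name `make_limited_callback` is hypothetical, and the `(samples, progress)` signature mirrors `ProgressCallback` above without the `arg` pointer. Whether a given language binding accepts a callback of exactly this shape is an assumption, so check the relevant API documentation.

```python
from typing import Callable, List, Sequence


def make_limited_callback(
    max_samples: int, sink: List[float]
) -> Callable[[Sequence[float], float], int]:
    """Build a callback that stops generation after max_samples samples."""

    def callback(samples: Sequence[float], progress: float) -> int:
        # Keep every chunk that arrives, including the one that hits the limit.
        sink.extend(samples)
        # 1 means continue generating, 0 means stop.
        return 1 if len(sink) < max_samples else 0

    return callback
```

Because the state lives in `sink`, the caller can read the collected samples after generation stops, whether it ran to completion or was cut short.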

Rust API

You can use the following code to play with vits-piper-hu_HU-anna-medium with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-hu_HU-anna-medium/hu_HU-anna-medium.onnx".into()),
                tokens: Some("vits-piper-hu_HU-anna-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-hu_HU-anna-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Ha északról fúj a szél, a lányok nem lógnak együtt.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-hu_HU-anna-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-hu_HU-anna-medium/hu_HU-anna-medium.onnx',
        tokens: 'vits-piper-hu_HU-anna-medium/tokens.txt',
        dataDir: 'vits-piper-hu_HU-anna-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Ha északról fúj a szél, a lányok nem lógnak együtt.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.
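
The Node.js example above reports a real-time factor (RTF): the time spent generating divided by the duration of the generated audio, where the duration is the number of samples divided by the sample rate. The same arithmetic as a minimal Python sketch follows. The function names `real_time_factor` and `timed` are hypothetical helpers, not part of sherpa-onnx.

```python
import time


def real_time_factor(
    elapsed_seconds: float, num_samples: int, sample_rate: int
) -> float:
    """Return elapsed time divided by audio duration."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration_seconds = num_samples / sample_rate
    return elapsed_seconds / duration_seconds


def timed(fn, *args, **kwargs):
    """Call fn and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start
```

An RTF below 1 means that audio is generated faster than it plays back. For instance, 0.5 seconds spent producing 44100 samples at 22050 Hz (2 seconds of audio) gives an RTF of 0.25.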

Dart API

You can use the following code to play with vits-piper-hu_HU-anna-medium with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-hu_HU-anna-medium/hu_HU-anna-medium.onnx',
    tokens: 'vits-piper-hu_HU-anna-medium/tokens.txt',
    dataDir: 'vits-piper-hu_HU-anna-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Ha északról fúj a szél, a lányok nem lógnak együtt.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-hu_HU-anna-medium with Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-hu_HU-anna-medium/hu_HU-anna-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-hu_HU-anna-medium/tokens.txt",
    dataDir: "vits-piper-hu_HU-anna-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Ha északról fúj a szél, a lányok nem lógnak együtt."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-hu_HU-anna-medium with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-hu_HU-anna-medium/hu_HU-anna-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-hu_HU-anna-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-hu_HU-anna-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Ha északról fúj a szél, a lányok nem lógnak együtt.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-hu_HU-anna-medium with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-hu_HU-anna-medium/hu_HU-anna-medium.onnx",
        tokens = "vits-piper-hu_HU-anna-medium/tokens.txt",
        dataDir = "vits-piper-hu_HU-anna-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Ha északról fúj a szél, a lányok nem lógnak együtt.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with vits-piper-hu_HU-anna-medium with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-hu_HU-anna-medium/hu_HU-anna-medium.onnx");
    vits.setTokens("vits-piper-hu_HU-anna-medium/tokens.txt");
    vits.setDataDir("vits-piper-hu_HU-anna-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Ha északról fúj a szél, a lányok nem lógnak együtt.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-hu_HU-anna-medium with Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-hu_HU-anna-medium/hu_HU-anna-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-hu_HU-anna-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-hu_HU-anna-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Ha északról fúj a szél, a lányok nem lógnak együtt.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-hu_HU-anna-medium with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-hu_HU-anna-medium/hu_HU-anna-medium.onnx",
				Tokens:  "vits-piper-hu_HU-anna-medium/tokens.txt",
				DataDir: "vits-piper-hu_HU-anna-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Ha északról fúj a szél, a lányok nem lógnak együtt."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Ha északról fúj a szél, a lányok nem lógnak együtt.

sample audio for each speaker is listed below:

Speaker 0

vits-piper-hu_HU-berta-medium

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/hu/hu_HU/berta/medium

Number of speakers | Sample rate (Hz)
1 | 22050

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-hu_HU-berta-medium.tar.bz2
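
The archive at the address above is a bzip2-compressed tarball. As a minimal sketch, assuming you prefer to fetch and unpack it from Python rather than with wget and tar, the standard library is enough. The helper names download_archive and extract_archive are hypothetical and are not part of sherpa-onnx.

```python
import tarfile
import urllib.request
from pathlib import Path
from typing import List

MODEL_URL = (
    "https://github.com/k2-fsa/sherpa-onnx/releases/download/"
    "tts-models/vits-piper-hu_HU-berta-medium.tar.bz2"
)


def download_archive(url: str, archive_path: Path) -> Path:
    """Download url to archive_path and return archive_path."""
    urllib.request.urlretrieve(url, str(archive_path))
    return archive_path


def extract_archive(archive_path: Path, dest_dir: Path) -> List[str]:
    """Extract a .tar.bz2 archive into dest_dir and return its member names."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:bz2") as tar:
        names = tar.getnames()
        tar.extractall(dest_dir)
    return names


# Typical use (left as a comment because it needs network access):
#   archive = download_archive(MODEL_URL, Path("vits-piper-hu_HU-berta-medium.tar.bz2"))
#   extract_archive(archive, Path("."))
```

After extraction, the examples in this section expect the directory vits-piper-hu_HU-berta-medium to be in the current working directory.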

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI | URL | China mirror
arm64-v8a | Download | Download
armeabi-v7a | Download | Download
x86_64 | Download | Download
x86 | Download | Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-hu_HU-berta-medium with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-hu_HU-berta-medium/hu_HU-berta-medium.onnx";
  config.model.vits.tokens = "vits-piper-hu_HU-berta-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-hu_HU-berta-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Ha északról fúj a szél, a lányok nem lógnak együtt.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
  -DSHERPA_ONNX_ENABLE_C_API=ON \
  -DCMAKE_BUILD_TYPE=Release \
  -DBUILD_SHARED_LIBS=ON \
  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc \
  -I /tmp/sherpa-onnx/shared/include \
  -L /tmp/sherpa-onnx/shared/lib \
  -o /tmp/test-piper \
  /tmp/test-piper.c \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-hu_HU-berta-medium.tar.bz2

You can use the following code to play with vits-piper-hu_HU-berta-medium

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-hu_HU-berta-medium/hu_HU-berta-medium.onnx",
            data_dir="vits-piper-hu_HU-berta-medium/espeak-ng-data",
            tokens="vits-piper-hu_HU-berta-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Ha északról fúj a szél, a lányok nem lógnak együtt.",
                     sid=0,
                     speed=1.0)

sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
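
To judge how fast generation is on your machine, a common measure is the real-time factor (RTF): elapsed wall-clock time divided by the duration of the generated audio. The sketch below is a plain helper, not part of sherpa-onnx. The function name real_time_factor and the numbers in the demo are illustrative assumptions.

```python
def real_time_factor(elapsed_seconds: float, num_samples: int, sample_rate: int) -> float:
    """Return elapsed time divided by audio duration (lower means faster)."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # audio length in seconds
    return elapsed_seconds / duration


# Illustrative numbers only: 2 seconds of 22050 Hz audio generated in 0.5 s.
rtf = real_time_factor(0.5, 44100, 22050)
print(f"RTF = {rtf:.3f}")  # prints "RTF = 0.250"
```

With the Python example above, elapsed_seconds could be measured with time.perf_counter() around the tts.generate(...) call, num_samples would be len(audio.samples), and sample_rate would be audio.sample_rate.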

C++ API

You can use the following code to play with vits-piper-hu_HU-berta-medium with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-hu_HU-berta-medium/hu_HU-berta-medium.onnx";
  config.model.vits.tokens = "vits-piper-hu_HU-berta-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-hu_HU-berta-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Ha északról fúj a szél, a lányok nem lógnak együtt.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
  -DSHERPA_ONNX_ENABLE_C_API=ON \
  -DCMAKE_BUILD_TYPE=Release \
  -DBUILD_SHARED_LIBS=ON \
  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++ \
  -std=c++17 \
  -I /tmp/sherpa-onnx/shared/include \
  -L /tmp/sherpa-onnx/shared/lib \
  -o /tmp/test-piper \
  /tmp/test-piper.cc \
  -lsherpa-onnx-cxx-api \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with vits-piper-hu_HU-berta-medium with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-hu_HU-berta-medium/hu_HU-berta-medium.onnx".into()),
                tokens: Some("vits-piper-hu_HU-berta-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-hu_HU-berta-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Ha északról fúj a szél, a lányok nem lógnak együtt.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-hu_HU-berta-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-hu_HU-berta-medium/hu_HU-berta-medium.onnx',
        tokens: 'vits-piper-hu_HU-berta-medium/tokens.txt',
        dataDir: 'vits-piper-hu_HU-berta-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Ha északról fúj a szél, a lányok nem lógnak együtt.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to play with vits-piper-hu_HU-berta-medium with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-hu_HU-berta-medium/hu_HU-berta-medium.onnx',
    tokens: 'vits-piper-hu_HU-berta-medium/tokens.txt',
    dataDir: 'vits-piper-hu_HU-berta-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Ha északról fúj a szél, a lányok nem lógnak együtt.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-hu_HU-berta-medium with Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-hu_HU-berta-medium/hu_HU-berta-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-hu_HU-berta-medium/tokens.txt",
    dataDir: "vits-piper-hu_HU-berta-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Ha északról fúj a szél, a lányok nem lógnak együtt."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-hu_HU-berta-medium with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-hu_HU-berta-medium/hu_HU-berta-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-hu_HU-berta-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-hu_HU-berta-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Ha északról fúj a szél, a lányok nem lógnak együtt.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-hu_HU-berta-medium with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-hu_HU-berta-medium/hu_HU-berta-medium.onnx",
        tokens = "vits-piper-hu_HU-berta-medium/tokens.txt",
        dataDir = "vits-piper-hu_HU-berta-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Ha északról fúj a szél, a lányok nem lógnak együtt.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.
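
The callback in the example above always returns 1, so generation never stops early. As a language-neutral illustration of the return-value contract (1 to continue, 0 to stop), here is a small Python sketch. StopAfterSeconds and fake_generate are hypothetical names. fake_generate only simulates an engine that honors the return value and is not a sherpa-onnx function.

```python
class StopAfterSeconds:
    """Callback that asks the generator to stop once enough audio has arrived.

    Returns 1 to continue and 0 to stop, matching the comments in the
    callback examples of this section.
    """

    def __init__(self, max_seconds: float, sample_rate: int):
        self.max_samples = int(max_seconds * sample_rate)
        self.num_samples = 0

    def __call__(self, samples) -> int:
        self.num_samples += len(samples)
        return 1 if self.num_samples < self.max_samples else 0


def fake_generate(chunks, callback) -> int:
    """Hypothetical stand-in for a TTS engine: passes each chunk to the
    callback and stops as soon as the callback returns 0.
    Returns the number of chunks delivered."""
    delivered = 0
    for chunk in chunks:
        delivered += 1
        if callback(chunk) == 0:
            break
    return delivered


# Four chunks of 0.5 s each at 22050 Hz; the budget is 1 second of audio.
chunks = [[0.0] * 11025 for _ in range(4)]
callback = StopAfterSeconds(max_seconds=1.0, sample_rate=22050)
print(fake_generate(chunks, callback))  # prints 2
```

The same idea carries over to the Kotlin callback: keep a running count of received samples and return 0 once the count reaches your limit.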

Java API

You can use the following code to play with vits-piper-hu_HU-berta-medium with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-hu_HU-berta-medium/hu_HU-berta-medium.onnx");
    vits.setTokens("vits-piper-hu_HU-berta-medium/tokens.txt");
    vits.setDataDir("vits-piper-hu_HU-berta-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Ha északról fúj a szél, a lányok nem lógnak együtt.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-hu_HU-berta-medium with Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-hu_HU-berta-medium/hu_HU-berta-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-hu_HU-berta-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-hu_HU-berta-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Ha északról fúj a szél, a lányok nem lógnak együtt.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-hu_HU-berta-medium with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-hu_HU-berta-medium/hu_HU-berta-medium.onnx",
				Tokens:  "vits-piper-hu_HU-berta-medium/tokens.txt",
				DataDir: "vits-piper-hu_HU-berta-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Ha északról fúj a szél, a lányok nem lógnak együtt."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Ha északról fúj a szél, a lányok nem lógnak együtt.

sample audio for each speaker is listed below:

Speaker 0

vits-piper-hu_HU-imre-medium

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/hu/hu_HU/imre/medium

Number of speakers | Sample rate (Hz)
1 | 22050
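
Once you have generated a wave file with one of the examples in this section, you may want to confirm that its header matches the sample rate listed in the table. The following sketch uses Python's standard wave module, which reads PCM-encoded WAV files only. The helper name wav_info is hypothetical.

```python
import wave


def wav_info(path: str) -> dict:
    """Read basic header fields from a PCM WAV file."""
    with wave.open(path, "rb") as f:
        sample_rate = f.getframerate()
        num_frames = f.getnframes()
        return {
            "sample_rate": sample_rate,
            "channels": f.getnchannels(),
            "num_frames": num_frames,
            "duration_seconds": num_frames / sample_rate,
        }


# Typical use after running one of the examples (the file must exist):
#   info = wav_info("test.wav")
#   print(info["sample_rate"])  # expected to be 22050 for this model
```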

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-hu_HU-imre-medium.tar.bz2

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI | URL | China mirror
arm64-v8a | Download | Download
armeabi-v7a | Download | Download
x86_64 | Download | Download
x86 | Download | Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-hu_HU-imre-medium with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-hu_HU-imre-medium/hu_HU-imre-medium.onnx";
  config.model.vits.tokens = "vits-piper-hu_HU-imre-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-hu_HU-imre-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Ha északról fúj a szél, a lányok nem lógnak együtt.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
  -DSHERPA_ONNX_ENABLE_C_API=ON \
  -DCMAKE_BUILD_TYPE=Release \
  -DBUILD_SHARED_LIBS=ON \
  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc \
  -I /tmp/sherpa-onnx/shared/include \
  -L /tmp/sherpa-onnx/shared/lib \
  -o /tmp/test-piper \
  /tmp/test-piper.c \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-hu_HU-imre-medium.tar.bz2

You can use the following code to play with vits-piper-hu_HU-imre-medium

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-hu_HU-imre-medium/hu_HU-imre-medium.onnx",
            data_dir="vits-piper-hu_HU-imre-medium/espeak-ng-data",
            tokens="vits-piper-hu_HU-imre-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Ha északról fúj a szél, a lányok nem lógnak együtt.",
                     sid=0,
                     speed=1.0)

sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
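
If the soundfile package is not available, the samples can also be written as a 16-bit PCM wave file with the standard library alone. This is a minimal sketch, assuming the samples are mono floating-point values in the range [-1, 1]. The helper name write_wav_pcm16 is hypothetical.

```python
import struct
import wave
from typing import Sequence


def write_wav_pcm16(path: str, samples: Sequence[float], sample_rate: int) -> int:
    """Write mono float samples as 16-bit PCM and return the frame count."""
    pcm = []
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))  # clip values outside [-1, 1]
        pcm.append(int(round(s * 32767)))  # scale to the int16 range
    with wave.open(path, "wb") as f:
        f.setnchannels(1)   # mono
        f.setsampwidth(2)   # 2 bytes per sample = 16 bits
        f.setframerate(sample_rate)
        f.writeframes(struct.pack(f"<{len(pcm)}h", *pcm))
    return len(pcm)


# Typical use with the example above (assumes `audio` already exists):
#   write_wav_pcm16("test.wav", audio.samples, audio.sample_rate)
```

Clipping before scaling avoids integer overflow when a sample falls slightly outside the assumed range.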

C++ API

You can use the following code to play with vits-piper-hu_HU-imre-medium with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-hu_HU-imre-medium/hu_HU-imre-medium.onnx";
  config.model.vits.tokens = "vits-piper-hu_HU-imre-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-hu_HU-imre-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Ha északról fúj a szél, a lányok nem lógnak együtt.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
  -DSHERPA_ONNX_ENABLE_C_API=ON \
  -DCMAKE_BUILD_TYPE=Release \
  -DBUILD_SHARED_LIBS=ON \
  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++ \
  -std=c++17 \
  -I /tmp/sherpa-onnx/shared/include \
  -L /tmp/sherpa-onnx/shared/lib \
  -o /tmp/test-piper \
  /tmp/test-piper.cc \
  -lsherpa-onnx-cxx-api \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with vits-piper-hu_HU-imre-medium with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-hu_HU-imre-medium/hu_HU-imre-medium.onnx".into()),
                tokens: Some("vits-piper-hu_HU-imre-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-hu_HU-imre-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Ha északról fúj a szél, a lányok nem lógnak együtt.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-hu_HU-imre-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-hu_HU-imre-medium/hu_HU-imre-medium.onnx',
        tokens: 'vits-piper-hu_HU-imre-medium/tokens.txt',
        dataDir: 'vits-piper-hu_HU-imre-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Ha északról fúj a szél, a lányok nem lógnak együtt.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to play with vits-piper-hu_HU-imre-medium with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-hu_HU-imre-medium/hu_HU-imre-medium.onnx',
    tokens: 'vits-piper-hu_HU-imre-medium/tokens.txt',
    dataDir: 'vits-piper-hu_HU-imre-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Ha északról fúj a szél, a lányok nem lógnak együtt.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-hu_HU-imre-medium with Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-hu_HU-imre-medium/hu_HU-imre-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-hu_HU-imre-medium/tokens.txt",
    dataDir: "vits-piper-hu_HU-imre-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Ha északról fúj a szél, a lányok nem lógnak együtt."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-hu_HU-imre-medium with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-hu_HU-imre-medium/hu_HU-imre-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-hu_HU-imre-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-hu_HU-imre-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Ha északról fúj a szél, a lányok nem lógnak együtt.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-hu_HU-imre-medium with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-hu_HU-imre-medium/hu_HU-imre-medium.onnx",
        tokens = "vits-piper-hu_HU-imre-medium/tokens.txt",
        dataDir = "vits-piper-hu_HU-imre-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Ha északról fúj a szél, a lányok nem lógnak együtt.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with vits-piper-hu_HU-imre-medium with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-hu_HU-imre-medium/hu_HU-imre-medium.onnx");
    vits.setTokens("vits-piper-hu_HU-imre-medium/tokens.txt");
    vits.setDataDir("vits-piper-hu_HU-imre-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Ha északról fúj a szél, a lányok nem lógnak együtt.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-hu_HU-imre-medium with Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-hu_HU-imre-medium/hu_HU-imre-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-hu_HU-imre-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-hu_HU-imre-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Ha északról fúj a szél, a lányok nem lógnak együtt.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-hu_HU-imre-medium with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-hu_HU-imre-medium/hu_HU-imre-medium.onnx",
				Tokens:  "vits-piper-hu_HU-imre-medium/tokens.txt",
				DataDir: "vits-piper-hu_HU-imre-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Ha északról fúj a szél, a lányok nem lógnak együtt."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Ha északról fúj a szél, a lányok nem lógnak együtt.

sample audios for different speakers are listed below:

Speaker 0

supertonic-3-hu

Info about this model

This model is supertonic 3 from https://huggingface.co/Supertone/supertonic-3

It supports 31 languages: en, ko, ja, ar, bg, cs, da, de, el, es, et, fi, fr, hi, hr, hu, id, it, lt, lv, nl, pl, pt, ro, ru, sk, sl, sv, tr, uk, vi.

This page shows samples for Hungarian (hu).
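Several of the examples on this page select the language by passing a small JSON string such as `{"lang": "hu"}` in the generation config's `extra` field. The following is a minimal sketch of building that string and rejecting codes that are not in the list above. The names `SUPPORTED_LANGS` and `make_extra` are hypothetical helpers introduced for illustration; they are not part of sherpa-onnx.

```python
import json

# Language codes copied from the list above (31 entries).
SUPPORTED_LANGS = (
    "en", "ko", "ja", "ar", "bg", "cs", "da", "de", "el", "es", "et", "fi",
    "fr", "hi", "hr", "hu", "id", "it", "lt", "lv", "nl", "pl", "pt", "ro",
    "ru", "sk", "sl", "sv", "tr", "uk", "vi",
)


def make_extra(lang: str) -> str:
    """Return a JSON string for the `extra` field (hypothetical helper)."""
    if lang not in SUPPORTED_LANGS:
        raise ValueError(f"Unsupported language code: {lang!r}")
    return json.dumps({"lang": lang})
```

For Hungarian, `make_extra("hu")` produces the same string that the C, Go, and Java examples on this page write by hand.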

| Number of speakers | Sample rate |
|--------------------|-------------|
| 10                 | 24000       |

Speaker IDs

sid: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-supertonic-3-tts-int8-2026-05-11.tar.bz2
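If you prefer to fetch the archive from Python instead of a shell, the following sketch downloads and unpacks it with the standard library only. It assumes the archive extracts into a directory named after the archive itself, which matches the paths used by the examples on this page; `archive_stem` and `download_and_extract` are hypothetical helper names.

```python
import tarfile
import urllib.request
from pathlib import Path

MODEL_URL = (
    "https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/"
    "sherpa-onnx-supertonic-3-tts-int8-2026-05-11.tar.bz2"
)


def archive_stem(url: str) -> str:
    """Return the archive file name without its .tar.bz2 suffix."""
    name = url.rsplit("/", 1)[-1]
    suffix = ".tar.bz2"
    if not name.endswith(suffix):
        raise ValueError(f"Expected a {suffix} archive, got {name!r}")
    return name[: -len(suffix)]


def download_and_extract(url: str = MODEL_URL, dest: str = ".") -> Path:
    """Download the archive into dest (if not already there) and unpack it.

    Returns the directory the model is assumed to extract into.
    """
    dest_dir = Path(dest)
    dest_dir.mkdir(parents=True, exist_ok=True)
    archive = dest_dir / url.rsplit("/", 1)[-1]
    if not archive.exists():
        urllib.request.urlretrieve(url, archive)
    with tarfile.open(archive, "r:bz2") as tar:
        tar.extractall(dest_dir)
    return dest_dir / archive_stem(url)
```

Calling `download_and_extract()` with no arguments places the model directory in the current working directory, where the example code below expects to find it.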

Android APK

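The following table lists the Android TTS Engine APKs with this model for sherpa-onnx v1.13.2.

| ABI         | URL      | China mirror |
|-------------|----------|--------------|
| arm64-v8a   | Download | Download     |
| armeabi-v7a | Download | Download     |
| x86_64      | Download | Download     |
| x86         | Download | Download     |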

If you are not sure which ABI your device uses, you probably need arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-supertonic-3-tts-int8-2026-05-11.tar.bz2

You can use the following code to play with supertonic-3 with the Python API.

import sherpa_onnx
import soundfile as sf

tts_config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        supertonic=sherpa_onnx.OfflineTtsSupertonicModelConfig(
            duration_predictor="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
            text_encoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
            vector_estimator="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
            vocoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
            tts_json="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
            unicode_indexer="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
            voice_style="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
        ),
        debug=False,
        num_threads=2,
        provider="cpu",
    ),
)

tts = sherpa_onnx.OfflineTts(tts_config)

gen_config = sherpa_onnx.GenerationConfig()
gen_config.sid = 0
gen_config.num_steps = 8
gen_config.speed = 1.0
gen_config.extra["lang"] = "hu"

audio = tts.generate("Ez egy szövegfelolvasó motor a következő generációs kaldi használatával", gen_config)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
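A wrong path in the configuration is an easy mistake to make, so it can help to confirm that the extracted directory contains every file the example above refers to before creating the `OfflineTts` object. The sketch below uses only the standard library; the file names are taken from the configuration above, while `REQUIRED_FILES` and `missing_model_files` are hypothetical names introduced here.

```python
from pathlib import Path

# File names used by the supertonic configuration in the example above.
REQUIRED_FILES = (
    "duration_predictor.int8.onnx",
    "text_encoder.int8.onnx",
    "vector_estimator.int8.onnx",
    "vocoder.int8.onnx",
    "tts.json",
    "unicode_indexer.bin",
    "voice.bin",
)


def missing_model_files(model_dir: str) -> list:
    """Return the required file names that are absent from model_dir."""
    root = Path(model_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]
```

An empty list from `missing_model_files("sherpa-onnx-supertonic-3-tts-int8-2026-05-11")` means all seven files are in place.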

C API

You can use the following code to play with supertonic-3 with C API.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));
  config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
  config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
  config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
  config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
  config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
  config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
  config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";

  config.model.num_threads = 2;

  // If you don't want to see debug messages, please set it to 0
  config.model.debug = 1;

  const char *text = "Ez egy szövegfelolvasó motor a következő generációs kaldi használatával";

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.num_steps = 8;
  gen_cfg.speed = 1.0;
  gen_cfg.extra = "{\"lang\": \"hu\"}";

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg,
                                             ProgressCallback, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

To compile and run the above example, first build and install sherpa-onnx as a shared library:

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
 -DSHERPA_ONNX_ENABLE_C_API=ON \
 -DCMAKE_BUILD_TYPE=Release \
 -DBUILD_SHARED_LIBS=ON \
 -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
 ..

make
make install

You can find the required header files and library files in /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-supertonic.c. Then you can compile it with the following command:

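# The source file is listed before the libraries so that linkers
# which drop unneeded libraries (--as-needed) still resolve all symbols.
gcc \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-supertonic \
  /tmp/test-supertonic.c \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-c-api \
  -lonnxruntime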

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-supertonic.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html
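The `ProgressCallback` in the C example returns 1 to continue generating and 0 to stop. The following Python sketch simulates that contract without calling sherpa-onnx at all. The function `run_with_callback` and the pre-chunked input are illustrative assumptions; the sketch does not claim to reproduce how the library splits audio into chunks or what it returns after an early stop.

```python
from typing import Callable, List, Sequence


def run_with_callback(
    chunks: Sequence[Sequence[float]],
    callback: Callable[[Sequence[float], float], int],
) -> List[float]:
    """Feed chunks to callback one by one; stop when it returns 0.

    In this sketch the chunk that triggers the stop is still kept.
    """
    delivered: List[float] = []
    total = len(chunks)
    for i, chunk in enumerate(chunks, start=1):
        delivered.extend(chunk)
        # progress is a fraction in (0, 1], as in the C example.
        if callback(chunk, i / total) == 0:
            break
    return delivered
```

A callback that always returns 1 lets the loop run to completion, which mirrors the behavior of the C example above.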

C++ API

You can use the following code to play with supertonic-3 with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;

  config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
  config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
  config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
  config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
  config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
  config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
  config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";

  config.model.num_threads = 2;

  // If you don't want to see debug messages, please set it to 0
  config.model.debug = 1;

  std::string filename = "./test.wav";
  std::string text = "Ez egy szövegfelolvasó motor a következő generációs kaldi használatával";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.num_steps = 8;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed
  gen_cfg.extra = R"({"lang": "hu"})";

  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

To compile and run the above example, first build and install sherpa-onnx as a shared library:

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
 -DSHERPA_ONNX_ENABLE_C_API=ON \
 -DCMAKE_BUILD_TYPE=Release \
 -DBUILD_SHARED_LIBS=ON \
 -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
 ..

make
make install

You can find the required header files and library files in /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-supertonic.cc. Then you can compile it with the following command:

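# The source file is listed before the libraries so that linkers
# which drop unneeded libraries (--as-needed) still resolve all symbols.
g++ \
  -std=c++17 \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-supertonic \
  /tmp/test-supertonic.cc \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-cxx-api \
  -lsherpa-onnx-c-api \
  -lonnxruntime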

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-supertonic.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with supertonic-3 with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsSupertonicModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            supertonic: OfflineTtsSupertonicModelConfig {
                duration_predictor: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx".into()),
                text_encoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx".into()),
                vector_estimator: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx".into()),
                vocoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx".into()),
                tts_json: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json".into()),
                unicode_indexer: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin".into()),
                voice_style: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Ez egy szövegfelolvasó motor a következő generációs kaldi használatával";

    let gen_config = GenerationConfig {
        sid: 0,
        num_steps: 8,
        speed: 1.0,
        extra: Some(r#"{"lang": "hu"}"#.into()),
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with supertonic-3 with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      supertonic: {
        durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
        textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
        vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
        vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
        ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
        unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
        voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
      },
      debug: true,
      numThreads: 2,
      provider: 'cpu',
    },
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Ez egy szövegfelolvasó motor a következő generációs kaldi használatával';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  numSteps: 8,
  speed: 1.0,
  extra: {lang: 'hu'},
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.
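The Node.js example above reports the real-time factor (RTF): the time spent generating divided by the duration of the generated audio, so a value below 1 means synthesis ran faster than playback. The same arithmetic can be sketched in Python as follows; `real_time_factor` is a hypothetical helper name, and its inputs correspond to the elapsed time, `audio.samples.length`, and `audio.sampleRate` in the example.

```python
def real_time_factor(elapsed_seconds: float, num_samples: int,
                     sample_rate: int) -> float:
    """Return elapsed time divided by audio duration."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # audio length in seconds
    return elapsed_seconds / duration
```

For instance, 48000 samples at 24000 Hz are 2 seconds of audio, so generating them in 0.5 seconds gives an RTF of 0.25.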

Dart API

You can use the following code to play with supertonic-3 with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final supertonic = sherpa_onnx.OfflineTtsSupertonicModelConfig(
    durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
    textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
    vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
    vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
    ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
    unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
    voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    supertonic: supertonic,
    numThreads: 2,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    numSteps: 8,
    speed: 1.0,
    extra: {'lang': 'hu'},
  );
  final audio = tts.generateWithConfig(text: 'Ez egy szövegfelolvasó motor a következő generációs kaldi használatával', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with supertonic-3 with Swift API.

func run() {
  let supertonic = sherpaOnnxOfflineTtsSupertonicModelConfig(
    durationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
    textEncoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
    vectorEstimator: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
    vocoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
    ttsJson: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
    unicodeIndexer: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
    voiceStyle: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(supertonic: supertonic)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Ez egy szövegfelolvasó motor a következő generációs kaldi használatával"
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.numSteps = 8
  genConfig.speed = 1.0
  genConfig.extra = ["lang": "hu"]

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with supertonic-3 with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Supertonic.DurationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.Model.Supertonic.TextEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.Model.Supertonic.VectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.Model.Supertonic.Vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.Model.Supertonic.TtsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.Model.Supertonic.UnicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.Model.Supertonic.VoiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";

config.Model.NumThreads = 2;
config.Model.Debug = 1;
config.Model.Provider = "cpu";

var tts = new OfflineTts(config);
var text = "Ez egy szövegfelolvasó motor a következő generációs kaldi használatával";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.NumSteps = 8;
genConfig.Speed = 1.0f;
genConfig.Extra = "{\"lang\": \"hu\"}";

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with supertonic-3 with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      supertonic = OfflineTtsSupertonicModelConfig(
        durationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
        textEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
        vectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
        vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
        ttsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
        unicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
        voiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
      ),
      numThreads = 2,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    numSteps = 8,
    speed = 1.0f,
    extra = mapOf("lang" to "hu"),
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Ez egy szövegfelolvasó motor a következő generációs kaldi használatával",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with supertonic-3 with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var supertonic = new OfflineTtsSupertonicModelConfig();
    supertonic.setDurationPredictor("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx");
    supertonic.setTextEncoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx");
    supertonic.setVectorEstimator("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx");
    supertonic.setVocoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx");
    supertonic.setTtsJson("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json");
    supertonic.setUnicodeIndexer("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin");
    supertonic.setVoiceStyle("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setSupertonic(supertonic);
    modelConfig.setNumThreads(2);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);

    var tts = new OfflineTts(config);
    var text = "Ez egy szövegfelolvasó motor a következő generációs kaldi használatával";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setNumSteps(8);
    genConfig.setSpeed(1.0f);
    genConfig.setExtra("{\"lang\": \"hu\"}");

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with supertonic-3 with Pascal API.

program test_supertonic;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Supertonic.DurationPredictor := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx';
  Config.Model.Supertonic.TextEncoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx';
  Config.Model.Supertonic.VectorEstimator := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx';
  Config.Model.Supertonic.Vocoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx';
  Config.Model.Supertonic.TtsJson := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json';
  Config.Model.Supertonic.UnicodeIndexer := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin';
  Config.Model.Supertonic.VoiceStyle := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin';

  Config.Model.NumThreads := 2;
  Config.Model.Debug := True;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.NumSteps := 8;
  GenConfig.Speed := 1.0;
  GenConfig.Extra := '{"lang": "hu"}';

  Audio := Tts.GenerateWithConfig('Ez egy szövegfelolvasó motor a következő generációs kaldi használatával', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with supertonic-3 with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Supertonic: sherpa.OfflineTtsSupertonicModelConfig{
				DurationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
				TextEncoder:       "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
				VectorEstimator:   "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
				Vocoder:           "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
				TtsJson:           "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
				UnicodeIndexer:    "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
				VoiceStyle:        "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
			},
			NumThreads: 2,
			Debug:      true,
		},
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Ez egy szövegfelolvasó motor a következő generációs kaldi használatával"

	genConfig := sherpa.GenerationConfig{
		Sid:      0,
		NumSteps: 8,
		Speed:    1.0,
		Extra:    `{"lang": "hu"}`,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

Sample audio for different speakers is listed below:

Speaker 0

0

Helló világ.

1

Hogy vagy ma?

2

Az ég kék, a szél pedig enyhe.

3

A gépi tanulás segít a számítógépeknek adatokból tanulni.

4

A beszédszintézis a szöveget tiszta hanggá alakítja.

5

A diákok rövid történetet olvastak a könyvtárban.

6

A vonat a pálya karbantartása miatt késett.

7

A kis modellek gyorsan futnak helyi eszközökön.

8

A hangasszisztens segít a mindennapi feladatokban.

9

A stabil felolvasás fontos rövid és hosszú mondatoknál is.

Speaker 1

0

Helló világ.

1

Hogy vagy ma?

2

Az ég kék, a szél pedig enyhe.

3

A gépi tanulás segít a számítógépeknek adatokból tanulni.

4

A beszédszintézis a szöveget tiszta hanggá alakítja.

5

A diákok rövid történetet olvastak a könyvtárban.

6

A vonat a pálya karbantartása miatt késett.

7

A kis modellek gyorsan futnak helyi eszközökön.

8

A hangasszisztens segít a mindennapi feladatokban.

9

A stabil felolvasás fontos rövid és hosszú mondatoknál is.

Speaker 2

0

Helló világ.

1

Hogy vagy ma?

2

Az ég kék, a szél pedig enyhe.

3

A gépi tanulás segít a számítógépeknek adatokból tanulni.

4

A beszédszintézis a szöveget tiszta hanggá alakítja.

5

A diákok rövid történetet olvastak a könyvtárban.

6

A vonat a pálya karbantartása miatt késett.

7

A kis modellek gyorsan futnak helyi eszközökön.

8

A hangasszisztens segít a mindennapi feladatokban.

9

A stabil felolvasás fontos rövid és hosszú mondatoknál is.

Speaker 3

0

Helló világ.

1

Hogy vagy ma?

2

Az ég kék, a szél pedig enyhe.

3

A gépi tanulás segít a számítógépeknek adatokból tanulni.

4

A beszédszintézis a szöveget tiszta hanggá alakítja.

5

A diákok rövid történetet olvastak a könyvtárban.

6

A vonat a pálya karbantartása miatt késett.

7

A kis modellek gyorsan futnak helyi eszközökön.

8

A hangasszisztens segít a mindennapi feladatokban.

9

A stabil felolvasás fontos rövid és hosszú mondatoknál is.

Speaker 4

0

Helló világ.

1

Hogy vagy ma?

2

Az ég kék, a szél pedig enyhe.

3

A gépi tanulás segít a számítógépeknek adatokból tanulni.

4

A beszédszintézis a szöveget tiszta hanggá alakítja.

5

A diákok rövid történetet olvastak a könyvtárban.

6

A vonat a pálya karbantartása miatt késett.

7

A kis modellek gyorsan futnak helyi eszközökön.

8

A hangasszisztens segít a mindennapi feladatokban.

9

A stabil felolvasás fontos rövid és hosszú mondatoknál is.

Speaker 5

0

Helló világ.

1

Hogy vagy ma?

2

Az ég kék, a szél pedig enyhe.

3

A gépi tanulás segít a számítógépeknek adatokból tanulni.

4

A beszédszintézis a szöveget tiszta hanggá alakítja.

5

A diákok rövid történetet olvastak a könyvtárban.

6

A vonat a pálya karbantartása miatt késett.

7

A kis modellek gyorsan futnak helyi eszközökön.

8

A hangasszisztens segít a mindennapi feladatokban.

9

A stabil felolvasás fontos rövid és hosszú mondatoknál is.

Speaker 6

0

Helló világ.

1

Hogy vagy ma?

2

Az ég kék, a szél pedig enyhe.

3

A gépi tanulás segít a számítógépeknek adatokból tanulni.

4

A beszédszintézis a szöveget tiszta hanggá alakítja.

5

A diákok rövid történetet olvastak a könyvtárban.

6

A vonat a pálya karbantartása miatt késett.

7

A kis modellek gyorsan futnak helyi eszközökön.

8

A hangasszisztens segít a mindennapi feladatokban.

9

A stabil felolvasás fontos rövid és hosszú mondatoknál is.

Speaker 7

0

Helló világ.

1

Hogy vagy ma?

2

Az ég kék, a szél pedig enyhe.

3

A gépi tanulás segít a számítógépeknek adatokból tanulni.

4

A beszédszintézis a szöveget tiszta hanggá alakítja.

5

A diákok rövid történetet olvastak a könyvtárban.

6

A vonat a pálya karbantartása miatt késett.

7

A kis modellek gyorsan futnak helyi eszközökön.

8

A hangasszisztens segít a mindennapi feladatokban.

9

A stabil felolvasás fontos rövid és hosszú mondatoknál is.

Speaker 8

0

Helló világ.

1

Hogy vagy ma?

2

Az ég kék, a szél pedig enyhe.

3

A gépi tanulás segít a számítógépeknek adatokból tanulni.

4

A beszédszintézis a szöveget tiszta hanggá alakítja.

5

A diákok rövid történetet olvastak a könyvtárban.

6

A vonat a pálya karbantartása miatt késett.

7

A kis modellek gyorsan futnak helyi eszközökön.

8

A hangasszisztens segít a mindennapi feladatokban.

9

A stabil felolvasás fontos rövid és hosszú mondatoknál is.

Speaker 9

0

Helló világ.

1

Hogy vagy ma?

2

Az ég kék, a szél pedig enyhe.

3

A gépi tanulás segít a számítógépeknek adatokból tanulni.

4

A beszédszintézis a szöveget tiszta hanggá alakítja.

5

A diákok rövid történetet olvastak a könyvtárban.

6

A vonat a pálya karbantartása miatt késett.

7

A kis modellek gyorsan futnak helyi eszközökön.

8

A hangasszisztens segít a mindennapi feladatokban.

9

A stabil felolvasás fontos rövid és hosszú mondatoknál is.

Icelandic

This section lists text-to-speech models for Icelandic.

vits-piper-is_IS-bui-medium

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/is/is_IS/bui/medium

Number of speakers    Sample rate (Hz)
1                     22050

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-is_IS-bui-medium.tar.bz2
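
If wget and tar are not available, the archive can also be fetched and unpacked from Python. The sketch below uses only the standard library. It assumes that the archive unpacks into a directory named after the archive without the .tar.bz2 suffix, which matches the paths used in the examples in this section. The function names are illustrative.

```python
import tarfile
import urllib.request
from pathlib import Path

SUFFIX = ".tar.bz2"


def model_dir_name(url: str) -> str:
    """Derive the model directory name from the archive URL.

    Assumption: the archive unpacks into a directory named after the
    archive itself, minus the .tar.bz2 suffix.
    """
    name = url.rsplit("/", 1)[-1]
    if not name.endswith(SUFFIX):
        raise ValueError(f"expected a {SUFFIX} archive, got {name!r}")
    return name[: -len(SUFFIX)]


def download_and_extract(url: str, dest: Path) -> Path:
    """Download the archive into dest, unpack it, and delete the archive."""
    dest.mkdir(parents=True, exist_ok=True)
    archive = dest / url.rsplit("/", 1)[-1]
    urllib.request.urlretrieve(url, archive)
    with tarfile.open(archive, "r:bz2") as tar:
        tar.extractall(dest)
    archive.unlink()
    return dest / model_dir_name(url)
```

For example, download_and_extract(url, Path(".")) with the address above should leave a vits-piper-is_IS-bui-medium directory in the current directory.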

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI            URL         China mirror
arm64-v8a      Download    Download
armeabi-v7a    Download    Download
x86_64         Download    Download
x86            Download    Download

If you don’t know what ABI is, you probably need to select arm64-v8a.
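
If you can open a shell on the device, the machine string printed by uname -m gives a hint about which ABI to pick. The following Python sketch maps common machine strings to the ABI names in the table above. The mapping covers only typical values and the function name is hypothetical.

```python
# Machine strings as typically reported by `uname -m`.
# This table is an illustrative assumption and is not exhaustive.
MACHINE_TO_ABI = {
    "aarch64": "arm64-v8a",
    "arm64": "arm64-v8a",
    "armv7l": "armeabi-v7a",
    "x86_64": "x86_64",
    "amd64": "x86_64",
    "i686": "x86",
    "i386": "x86",
}


def android_abi(machine: str) -> str:
    """Map a machine string to an Android ABI name."""
    key = machine.strip().lower()
    if key not in MACHINE_TO_ABI:
        raise ValueError(f"unknown machine string: {machine!r}")
    return MACHINE_TO_ABI[key]
```

For instance, android_abi("aarch64") returns "arm64-v8a", which is the usual choice for recent phones.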

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-is_IS-bui-medium with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-is_IS-bui-medium/is_IS-bui-medium.onnx";
  config.model.vits.tokens = "vits-piper-is_IS-bui-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-is_IS-bui-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

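  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
  if (!tts) {
    // Creation fails, e.g., when a model file path is wrong
    fprintf(stderr, "Failed to create the TTS engine. Please check your config\n");
    return 1;
  }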

  const char *text = "Farðu með allt, eða farðu ekki.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and libraries inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-is_IS-bui-medium.tar.bz2

You can use the following code to play with vits-piper-is_IS-bui-medium

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-is_IS-bui-medium/is_IS-bui-medium.onnx",
            data_dir="vits-piper-is_IS-bui-medium/espeak-ng-data",
            tokens="vits-piper-is_IS-bui-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Farðu með allt, eða farðu ekki.",
                     sid=0,
                     speed=1.0)

# Save as WAV, like the other examples in this section
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
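
If soundfile is not installed, the generated samples can also be saved with Python's standard wave module. The sketch below assumes that the samples are mono floating-point values in the range [-1, 1] and converts them to 16-bit PCM. The function name write_wave_int16 is illustrative.

```python
import struct
import wave


def write_wave_int16(filename, samples, sample_rate):
    """Write mono float samples in [-1, 1] as a 16-bit PCM WAV file.

    Values outside [-1, 1] are clipped before conversion.
    """
    frames = bytearray()
    for s in samples:
        clipped = max(-1.0, min(1.0, float(s)))
        frames += struct.pack("<h", int(round(clipped * 32767)))

    with wave.open(filename, "wb") as f:
        f.setnchannels(1)   # mono
        f.setsampwidth(2)   # 2 bytes per sample = 16 bit
        f.setframerate(sample_rate)
        f.writeframes(bytes(frames))
```

With the objects from the example above, the call would be write_wave_int16("test.wav", audio.samples, audio.sample_rate).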

C++ API

You can use the following code to play with vits-piper-is_IS-bui-medium with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-is_IS-bui-medium/is_IS-bui-medium.onnx";
  config.model.vits.tokens = "vits-piper-is_IS-bui-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-is_IS-bui-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Farðu með allt, eða farðu ekki.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and libraries inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html
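
The progress callback in the C++ example above returns 1 to continue generating and 0 to stop. The following Python sketch models that contract with a stand-in generator so that the early-stop behavior can be tried without a model. It is not the sherpa-onnx API: generate_in_chunks and its chunked input are assumptions made only for illustration.

```python
from typing import Callable, List, Sequence


def generate_in_chunks(
    chunks: Sequence[List[float]],
    callback: Callable[[List[float], float], int],
) -> List[float]:
    """Stand-in for a synthesizer that reports progress after each chunk.

    The callback receives the newest samples and the overall progress
    in [0, 1]. Returning 0 stops generation early; 1 continues.
    """
    out: List[float] = []
    for i, chunk in enumerate(chunks, start=1):
        out.extend(chunk)
        if callback(chunk, i / len(chunks)) == 0:
            break
    return out


def stop_at_half(samples: List[float], progress: float) -> int:
    """Example callback: ask the generator to stop once half is done."""
    return 0 if progress >= 0.5 else 1
```

With four chunks, stop_at_half ends generation after the second chunk, so only the samples produced up to that point are returned.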

Rust API

You can use the following code to play with vits-piper-is_IS-bui-medium with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-is_IS-bui-medium/is_IS-bui-medium.onnx".into()),
                tokens: Some("vits-piper-is_IS-bui-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-is_IS-bui-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Farðu með allt, eða farðu ekki.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-is_IS-bui-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-is_IS-bui-medium/is_IS-bui-medium.onnx',
        tokens: 'vits-piper-is_IS-bui-medium/tokens.txt',
        dataDir: 'vits-piper-is_IS-bui-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Farðu með allt, eða farðu ekki.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.
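
The real-time factor (RTF) printed by the example above is the elapsed synthesis time divided by the duration of the generated audio, so a value below 1 means that synthesis runs faster than playback. A minimal Python sketch of the same arithmetic follows; the function name is illustrative.

```python
def real_time_factor(elapsed_seconds: float, num_samples: int,
                     sample_rate: int) -> float:
    """Return elapsed time divided by audio duration.

    The audio duration in seconds is num_samples / sample_rate.
    """
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate
    return elapsed_seconds / duration
```

For example, generating 44100 samples at 22050 Hz (2 seconds of audio) in 0.5 seconds gives an RTF of 0.25.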

Dart API

You can use the following code to play with vits-piper-is_IS-bui-medium with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-is_IS-bui-medium/is_IS-bui-medium.onnx',
    tokens: 'vits-piper-is_IS-bui-medium/tokens.txt',
    dataDir: 'vits-piper-is_IS-bui-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Farðu með allt, eða farðu ekki.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-is_IS-bui-medium with Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-is_IS-bui-medium/is_IS-bui-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-is_IS-bui-medium/tokens.txt",
    dataDir: "vits-piper-is_IS-bui-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Farðu með allt, eða farðu ekki."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-is_IS-bui-medium with C# API.

using System;
using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-is_IS-bui-medium/is_IS-bui-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-is_IS-bui-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-is_IS-bui-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Farðu með allt, eða farðu ekki.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-is_IS-bui-medium with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-is_IS-bui-medium/is_IS-bui-medium.onnx",
        tokens = "vits-piper-is_IS-bui-medium/tokens.txt",
        dataDir = "vits-piper-is_IS-bui-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Farðu með allt, eða farðu ekki.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with vits-piper-is_IS-bui-medium with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-is_IS-bui-medium/is_IS-bui-medium.onnx");
    vits.setTokens("vits-piper-is_IS-bui-medium/tokens.txt");
    vits.setDataDir("vits-piper-is_IS-bui-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Farðu með allt, eða farðu ekki.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-is_IS-bui-medium with Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-is_IS-bui-medium/is_IS-bui-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-is_IS-bui-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-is_IS-bui-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Farðu með allt, eða farðu ekki.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-is_IS-bui-medium with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-is_IS-bui-medium/is_IS-bui-medium.onnx",
				Tokens:  "vits-piper-is_IS-bui-medium/tokens.txt",
				DataDir: "vits-piper-is_IS-bui-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Farðu með allt, eða farðu ekki."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Farðu með allt, eða farðu ekki.

sample audios for different speakers are listed below:

Speaker 0

vits-piper-is_IS-salka-medium

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/is/is_IS/salka/medium

Number of speakers    Sample rate (Hz)
1                     22050

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-is_IS-salka-medium.tar.bz2
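
The examples in this section refer to three paths inside the extracted directory: the .onnx model, tokens.txt, and the espeak-ng-data directory. Checking that they exist before creating the TTS object gives a clearer error than a failed model load. The following is a minimal Python sketch; the helper name missing_model_files is hypothetical.

```python
from pathlib import Path
from typing import List


def missing_model_files(model_dir: str, voice: str) -> List[str]:
    """Return the required paths that do not exist under model_dir.

    voice is the model file name without the .onnx extension,
    for example "is_IS-salka-medium".
    """
    root = Path(model_dir)
    required = [
        root / f"{voice}.onnx",
        root / "tokens.txt",
        root / "espeak-ng-data",
    ]
    return [str(p) for p in required if not p.exists()]
```

An empty list from missing_model_files("vits-piper-is_IS-salka-medium", "is_IS-salka-medium") means that all three paths are in place.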

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI            URL         China mirror
arm64-v8a      Download    Download
armeabi-v7a    Download    Download
x86_64         Download    Download
x86            Download    Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-is_IS-salka-medium with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-is_IS-salka-medium/is_IS-salka-medium.onnx";
  config.model.vits.tokens = "vits-piper-is_IS-salka-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-is_IS-salka-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

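  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
  if (!tts) {
    // Creation fails, e.g., when a model file path is wrong
    fprintf(stderr, "Failed to create the TTS engine. Please check your config\n");
    return 1;
  }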

  const char *text = "Farðu með allt, eða farðu ekki.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and libraries inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html
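
When the compiled program is launched from a script, the library search path described above can be set for the child process only, instead of exporting it in the shell. The sketch below builds such an environment in Python. It assumes the install prefix used in this section, and the helper name env_with_library_path is illustrative.

```python
import os
import platform
from typing import Dict, Optional


def env_with_library_path(
    lib_dir: str,
    system: Optional[str] = None,
    base_env: Optional[Dict[str, str]] = None,
) -> Dict[str, str]:
    """Return a copy of the environment with lib_dir prepended.

    Uses DYLD_LIBRARY_PATH on macOS ("Darwin") and LD_LIBRARY_PATH
    otherwise, mirroring the two export commands above.
    """
    system = system or platform.system()
    env = dict(os.environ if base_env is None else base_env)
    var = "DYLD_LIBRARY_PATH" if system == "Darwin" else "LD_LIBRARY_PATH"
    old = env.get(var, "")
    env[var] = f"{lib_dir}:{old}" if old else lib_dir
    return env
```

The result can be passed to subprocess.run(["/tmp/test-piper"], cwd="/tmp", env=env), where env is the dictionary returned for /tmp/sherpa-onnx/shared/lib.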

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-is_IS-salka-medium.tar.bz2

You can use the following code to play with vits-piper-is_IS-salka-medium

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-is_IS-salka-medium/is_IS-salka-medium.onnx",
            data_dir="vits-piper-is_IS-salka-medium/espeak-ng-data",
            tokens="vits-piper-is_IS-salka-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Farðu með allt, eða farðu ekki.",
                     sid=0,
                     speed=1.0)

# Save as WAV, like the other examples in this section
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)

C++ API

You can use the following code to play with vits-piper-is_IS-salka-medium with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-is_IS-salka-medium/is_IS-salka-medium.onnx";
  config.model.vits.tokens = "vits-piper-is_IS-salka-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-is_IS-salka-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Farðu með allt, eða farðu ekki.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and libraries inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with vits-piper-is_IS-salka-medium with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-is_IS-salka-medium/is_IS-salka-medium.onnx".into()),
                tokens: Some("vits-piper-is_IS-salka-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-is_IS-salka-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Farðu með allt, eða farðu ekki.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-is_IS-salka-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-is_IS-salka-medium/is_IS-salka-medium.onnx',
        tokens: 'vits-piper-is_IS-salka-medium/tokens.txt',
        dataDir: 'vits-piper-is_IS-salka-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Farðu með allt, eða farðu ekki.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to play with vits-piper-is_IS-salka-medium with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-is_IS-salka-medium/is_IS-salka-medium.onnx',
    tokens: 'vits-piper-is_IS-salka-medium/tokens.txt',
    dataDir: 'vits-piper-is_IS-salka-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Farðu með allt, eða farðu ekki.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-is_IS-salka-medium with Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-is_IS-salka-medium/is_IS-salka-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-is_IS-salka-medium/tokens.txt",
    dataDir: "vits-piper-is_IS-salka-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Farðu með allt, eða farðu ekki."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-is_IS-salka-medium with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-is_IS-salka-medium/is_IS-salka-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-is_IS-salka-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-is_IS-salka-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Farðu með allt, eða farðu ekki.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-is_IS-salka-medium with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-is_IS-salka-medium/is_IS-salka-medium.onnx",
        tokens = "vits-piper-is_IS-salka-medium/tokens.txt",
        dataDir = "vits-piper-is_IS-salka-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Farðu með allt, eða farðu ekki.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with vits-piper-is_IS-salka-medium with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-is_IS-salka-medium/is_IS-salka-medium.onnx");
    vits.setTokens("vits-piper-is_IS-salka-medium/tokens.txt");
    vits.setDataDir("vits-piper-is_IS-salka-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Farðu með allt, eða farðu ekki.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-is_IS-salka-medium with Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-is_IS-salka-medium/is_IS-salka-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-is_IS-salka-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-is_IS-salka-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Farðu með allt, eða farðu ekki.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-is_IS-salka-medium with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-is_IS-salka-medium/is_IS-salka-medium.onnx",
				Tokens:  "vits-piper-is_IS-salka-medium/tokens.txt",
				DataDir: "vits-piper-is_IS-salka-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Farðu með allt, eða farðu ekki."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Farðu með allt, eða farðu ekki.

sample audios for different speakers are listed below:

Speaker 0

vits-piper-is_IS-steinn-medium

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/is/is_IS/steinn/medium

Number of speakers | Sample rate (Hz)
1 | 22050

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-is_IS-steinn-medium.tar.bz2
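
After extracting the archive, it can help to confirm that the entries used by the examples in this section are present before running any of them. The following is a minimal sketch, not part of sherpa-onnx: the helper name missing_model_files is hypothetical, and it assumes the directory layout used by the examples (an .onnx file named after the directory without the vits-piper- prefix, tokens.txt, and espeak-ng-data). It requires Python 3.9 or later.

```python
from pathlib import Path


def missing_model_files(model_dir: str) -> list[str]:
    """Return the expected entries that are absent from model_dir.

    Hypothetical helper: the expected names mirror the paths used by the
    examples in this section; they are not read from any manifest.
    """
    root = Path(model_dir)
    # "vits-piper-is_IS-steinn-medium" -> "is_IS-steinn-medium"
    voice = root.name.removeprefix("vits-piper-")
    expected = [f"{voice}.onnx", "tokens.txt", "espeak-ng-data"]
    return [name for name in expected if not (root / name).exists()]
```

For example, missing_model_files("vits-piper-is_IS-steinn-medium") returns an empty list when all three entries exist in the current directory.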

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI | URL | China mirror
arm64-v8a | Download | Download
armeabi-v7a | Download | Download
x86_64 | Download | Download
x86 | Download | Download

If you don’t know what ABI is, you probably need to select arm64-v8a.
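
If you can read the machine name of the device (for instance with adb shell uname -m), the ABI can usually be derived from it. The sketch below is illustrative only: the mapping table and the function name abi_for_machine are assumptions, not part of sherpa-onnx, and unknown names fall back to arm64-v8a, following the advice above.

```python
# Common machine names (as reported by `uname -m`) and the Android ABI they
# usually correspond to. This table is an illustrative assumption.
_MACHINE_TO_ABI = {
    "aarch64": "arm64-v8a",
    "arm64": "arm64-v8a",
    "armv7l": "armeabi-v7a",
    "armv8l": "armeabi-v7a",  # 32-bit userspace on a 64-bit CPU
    "x86_64": "x86_64",
    "amd64": "x86_64",
    "i686": "x86",
    "i386": "x86",
}


def abi_for_machine(machine: str) -> str:
    """Map a machine name to an ABI label, defaulting to arm64-v8a."""
    return _MACHINE_TO_ABI.get(machine.strip().lower(), "arm64-v8a")
```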

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-is_IS-steinn-medium with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-is_IS-steinn-medium/is_IS-steinn-medium.onnx";
  config.model.vits.tokens = "vits-piper-is_IS-steinn-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-is_IS-steinn-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

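  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
  if (!tts) {
    // Creation fails if, for example, a model file cannot be found
    fprintf(stderr, "Failed to create the TTS engine. Please check your config\n");
    return -1;
  }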

  const char *text = "Farðu með allt, eða farðu ekki.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc -I /tmp/sherpa-onnx/shared/include /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime -o /tmp/test-piper

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.
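
If you launch the compiled program from a script, the same library path can be set for the child process only, without changing your shell. The following is a minimal sketch under stated assumptions: env_with_library_path is a hypothetical helper name, and it simply mirrors the two export commands above (DYLD_LIBRARY_PATH on macOS, LD_LIBRARY_PATH otherwise).

```python
import os
import platform


def env_with_library_path(lib_dir, system=None, base_env=None):
    """Return a copy of the environment with lib_dir prepended to the
    dynamic-library search path variable for the given system.

    Hypothetical helper: system defaults to platform.system() and
    base_env defaults to os.environ.
    """
    system = system or platform.system()
    env = dict(os.environ if base_env is None else base_env)
    var = "DYLD_LIBRARY_PATH" if system == "Darwin" else "LD_LIBRARY_PATH"
    current = env.get(var, "")
    # Avoid a trailing ":" when the variable was previously unset or empty
    env[var] = lib_dir if not current else f"{lib_dir}:{current}"
    return env
```

A possible use is subprocess.run(["/tmp/test-piper"], cwd="/tmp", env=env_with_library_path("/tmp/sherpa-onnx/shared/lib"), check=True).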

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-is_IS-steinn-medium.tar.bz2

You can use the following code to play with vits-piper-is_IS-steinn-medium with Python API.

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-is_IS-steinn-medium/is_IS-steinn-medium.onnx",
            data_dir="vits-piper-is_IS-steinn-medium/espeak-ng-data",
            tokens="vits-piper-is_IS-steinn-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Farðu með allt, eða farðu ekki.",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
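
The Node.js example in this document reports a real-time factor (RTF), that is, the elapsed synthesis time divided by the duration of the generated audio. The same number can be computed in Python. The sketch below is a plain helper with a hypothetical name (real_time_factor); it does not call sherpa-onnx itself.

```python
def real_time_factor(elapsed_seconds: float, num_samples: int,
                     sample_rate: int) -> float:
    """Return elapsed time divided by audio duration.

    Values below 1.0 mean synthesis ran faster than real time.
    """
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # audio length in seconds
    return elapsed_seconds / duration
```

To use it, record time.perf_counter() before and after tts.generate(...) and pass the difference together with len(audio.samples) and audio.sample_rate.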

C++ API

You can use the following code to play with vits-piper-is_IS-steinn-medium with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-is_IS-steinn-medium/is_IS-steinn-medium.onnx";
  config.model.vits.tokens = "vits-piper-is_IS-steinn-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-is_IS-steinn-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Farðu með allt, eða farðu ekki.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime -o /tmp/test-piper

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with vits-piper-is_IS-steinn-medium with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-is_IS-steinn-medium/is_IS-steinn-medium.onnx".into()),
                tokens: Some("vits-piper-is_IS-steinn-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-is_IS-steinn-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Farðu með allt, eða farðu ekki.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-is_IS-steinn-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-is_IS-steinn-medium/is_IS-steinn-medium.onnx',
        tokens: 'vits-piper-is_IS-steinn-medium/tokens.txt',
        dataDir: 'vits-piper-is_IS-steinn-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Farðu með allt, eða farðu ekki.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to play with vits-piper-is_IS-steinn-medium with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-is_IS-steinn-medium/is_IS-steinn-medium.onnx',
    tokens: 'vits-piper-is_IS-steinn-medium/tokens.txt',
    dataDir: 'vits-piper-is_IS-steinn-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Farðu með allt, eða farðu ekki.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-is_IS-steinn-medium with Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-is_IS-steinn-medium/is_IS-steinn-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-is_IS-steinn-medium/tokens.txt",
    dataDir: "vits-piper-is_IS-steinn-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Farðu með allt, eða farðu ekki."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-is_IS-steinn-medium with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-is_IS-steinn-medium/is_IS-steinn-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-is_IS-steinn-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-is_IS-steinn-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Farðu með allt, eða farðu ekki.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-is_IS-steinn-medium with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-is_IS-steinn-medium/is_IS-steinn-medium.onnx",
        tokens = "vits-piper-is_IS-steinn-medium/tokens.txt",
        dataDir = "vits-piper-is_IS-steinn-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Farðu með allt, eða farðu ekki.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with vits-piper-is_IS-steinn-medium with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-is_IS-steinn-medium/is_IS-steinn-medium.onnx");
    vits.setTokens("vits-piper-is_IS-steinn-medium/tokens.txt");
    vits.setDataDir("vits-piper-is_IS-steinn-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Farðu með allt, eða farðu ekki.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-is_IS-steinn-medium with Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-is_IS-steinn-medium/is_IS-steinn-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-is_IS-steinn-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-is_IS-steinn-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Farðu með allt, eða farðu ekki.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-is_IS-steinn-medium with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-is_IS-steinn-medium/is_IS-steinn-medium.onnx",
				Tokens:  "vits-piper-is_IS-steinn-medium/tokens.txt",
				DataDir: "vits-piper-is_IS-steinn-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Farðu með allt, eða farðu ekki."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Farðu með allt, eða farðu ekki.

sample audios for different speakers are listed below:

Speaker 0

vits-piper-is_IS-ugla-medium

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/is/is_IS/ugla/medium

Number of speakers | Sample rate (Hz)
1 | 22050

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-is_IS-ugla-medium.tar.bz2

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI | URL | China mirror
arm64-v8a | Download | Download
armeabi-v7a | Download | Download
x86_64 | Download | Download
x86 | Download | Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-is_IS-ugla-medium with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-is_IS-ugla-medium/is_IS-ugla-medium.onnx";
  config.model.vits.tokens = "vits-piper-is_IS-ugla-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-is_IS-ugla-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

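  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
  if (!tts) {
    // Creation fails if, for example, a model file cannot be found
    fprintf(stderr, "Failed to create the TTS engine. Please check your config\n");
    return -1;
  }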

  const char *text = "Farðu með allt, eða farðu ekki.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc -I /tmp/sherpa-onnx/shared/include /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime -o /tmp/test-piper

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-is_IS-ugla-medium.tar.bz2

You can use the following code to play with vits-piper-is_IS-ugla-medium with Python API.

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-is_IS-ugla-medium/is_IS-ugla-medium.onnx",
            data_dir="vits-piper-is_IS-ugla-medium/espeak-ng-data",
            tokens="vits-piper-is_IS-ugla-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Farðu með allt, eða farðu ekki.",
                     sid=0,
                     speed=1.0)

sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
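A wrong working directory is a common reason for config.validate() to fail, because the paths above are relative. The sketch below checks that the three paths used in the config exist before you build it. It uses only the standard library, and the helper name missing_model_files is hypothetical.

```python
from pathlib import Path


def missing_model_files(model_dir, onnx_name):
    """Return the expected model paths that do not exist under model_dir.

    The three entries match the model, tokens and data_dir arguments
    used in the example above.
    """
    root = Path(model_dir)
    expected = [root / onnx_name, root / "tokens.txt", root / "espeak-ng-data"]
    return [str(p) for p in expected if not p.exists()]
```

Calling missing_model_files("vits-piper-is_IS-ugla-medium", "is_IS-ugla-medium.onnx") returns an empty list when the archive has been extracted into the current directory.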

C++ API

You can use the following code to play with vits-piper-is_IS-ugla-medium with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-is_IS-ugla-medium/is_IS-ugla-medium.onnx";
  config.model.vits.tokens = "vits-piper-is_IS-ugla-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-is_IS-ugla-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Farðu með allt, eða farðu ekki.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
  -DSHERPA_ONNX_ENABLE_C_API=ON \
  -DCMAKE_BUILD_TYPE=Release \
  -DBUILD_SHARED_LIBS=ON \
  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++ \
  -std=c++17 \
  -I /tmp/sherpa-onnx/shared/include \
  -L /tmp/sherpa-onnx/shared/lib \
  -o /tmp/test-piper \
  /tmp/test-piper.cc \
  -lsherpa-onnx-cxx-api \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with vits-piper-is_IS-ugla-medium with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-is_IS-ugla-medium/is_IS-ugla-medium.onnx".into()),
                tokens: Some("vits-piper-is_IS-ugla-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-is_IS-ugla-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Farðu með allt, eða farðu ekki.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-is_IS-ugla-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-is_IS-ugla-medium/is_IS-ugla-medium.onnx',
        tokens: 'vits-piper-is_IS-ugla-medium/tokens.txt',
        dataDir: 'vits-piper-is_IS-ugla-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Farðu með allt, eða farðu ekki.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.
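The Node.js example above reports the real-time factor (RTF), which is the processing time divided by the duration of the generated audio. The same arithmetic is sketched below in Python; the function name real_time_factor is hypothetical and the sketch does not call sherpa-onnx.

```python
def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """Return elapsed time divided by audio duration.

    A value below 1 means the audio was generated faster than it
    takes to play it back.
    """
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # seconds of audio
    return elapsed_seconds / duration
```

For instance, generating 44100 samples at 22050 Hz (2 seconds of audio) in 0.5 seconds gives an RTF of 0.25.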

Dart API

You can use the following code to play with vits-piper-is_IS-ugla-medium with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-is_IS-ugla-medium/is_IS-ugla-medium.onnx',
    tokens: 'vits-piper-is_IS-ugla-medium/tokens.txt',
    dataDir: 'vits-piper-is_IS-ugla-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Farðu með allt, eða farðu ekki.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-is_IS-ugla-medium with Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-is_IS-ugla-medium/is_IS-ugla-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-is_IS-ugla-medium/tokens.txt",
    dataDir: "vits-piper-is_IS-ugla-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Farðu með allt, eða farðu ekki."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-is_IS-ugla-medium with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-is_IS-ugla-medium/is_IS-ugla-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-is_IS-ugla-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-is_IS-ugla-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Farðu með allt, eða farðu ekki.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-is_IS-ugla-medium with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-is_IS-ugla-medium/is_IS-ugla-medium.onnx",
        tokens = "vits-piper-is_IS-ugla-medium/tokens.txt",
        dataDir = "vits-piper-is_IS-ugla-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Farðu með allt, eða farðu ekki.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.
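The callback convention used above (return 1 to continue, 0 to stop) can be used to cut generation short. The Python sketch below illustrates the idea with a callback that asks for a stop once a sample budget is reached. It is a standalone illustration: make_budget_callback is a hypothetical name, and how such a callback is passed to a particular language binding is not shown here.

```python
def make_budget_callback(max_samples):
    """Build a callback that returns 1 until max_samples samples have
    been received in total, and 0 (stop) afterwards."""
    received = 0

    def callback(samples):
        nonlocal received
        received += len(samples)  # count samples delivered so far
        return 1 if received < max_samples else 0

    return callback
```

With a 22050 Hz model, make_budget_callback(22050 * 5) would request a stop after roughly five seconds of audio have been delivered.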

Java API

You can use the following code to play with vits-piper-is_IS-ugla-medium with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-is_IS-ugla-medium/is_IS-ugla-medium.onnx");
    vits.setTokens("vits-piper-is_IS-ugla-medium/tokens.txt");
    vits.setDataDir("vits-piper-is_IS-ugla-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Farðu með allt, eða farðu ekki.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-is_IS-ugla-medium with Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-is_IS-ugla-medium/is_IS-ugla-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-is_IS-ugla-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-is_IS-ugla-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Farðu með allt, eða farðu ekki.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-is_IS-ugla-medium with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-is_IS-ugla-medium/is_IS-ugla-medium.onnx",
				Tokens:  "vits-piper-is_IS-ugla-medium/tokens.txt",
				DataDir: "vits-piper-is_IS-ugla-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Farðu með allt, eða farðu ekki."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Farðu með allt, eða farðu ekki.

sample audio for each speaker is listed below:

Speaker 0

Indonesian

This section lists text to speech models for Indonesian.

vits-piper-id_ID-news_tts-medium

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/id/id_ID/news_tts/medium

Number of speakers    Sample rate
1                     22050

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-id_ID-news_tts-medium.tar.bz2
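If you prefer to fetch the archive from a script, the following Python sketch downloads and extracts it with the standard library only. It assumes the archive unpacks into a directory named after the archive file, which matches the paths used in the examples on this page. The function names model_dir_name and download_and_extract are hypothetical.

```python
import tarfile
from pathlib import Path
from urllib.request import urlretrieve


def model_dir_name(url):
    """Derive the expected model directory name from the archive URL."""
    name = url.rsplit("/", 1)[-1]
    suffix = ".tar.bz2"
    return name[: -len(suffix)] if name.endswith(suffix) else name


def download_and_extract(url, dest_dir="."):
    """Download the .tar.bz2 archive into dest_dir (if not already
    present), extract it there, and return the expected model directory."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    archive = dest / url.rsplit("/", 1)[-1]
    if not archive.exists():
        urlretrieve(url, archive)
    with tarfile.open(archive, "r:bz2") as tar:
        tar.extractall(dest)
    return dest / model_dir_name(url)
```

Calling download_and_extract with the address above and dest_dir="/tmp" should leave the model under /tmp/vits-piper-id_ID-news_tts-medium.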

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI            URL         China mirror
arm64-v8a      Download    Download
armeabi-v7a    Download    Download
x86_64         Download    Download
x86            Download    Download

If you don’t know what ABI is, you probably need to select arm64-v8a.
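If you can obtain the machine name of the device (for example the output of uname -m in a terminal on the device), it can be mapped to one of the four ABIs in the table. The Python sketch below covers only common machine names and returns None for anything else. The function name android_abi_for and the mapping table are illustrative assumptions, not part of sherpa-onnx.

```python
import platform

# Common machine names mapped to the Android ABI names used above.
_MACHINE_TO_ABI = {
    "aarch64": "arm64-v8a",
    "arm64": "arm64-v8a",
    "armv7l": "armeabi-v7a",
    "x86_64": "x86_64",
    "amd64": "x86_64",
    "i686": "x86",
    "i386": "x86",
}


def android_abi_for(machine=None):
    """Return the Android ABI for a machine name, or None if unknown.

    Defaults to the machine name of the system running this script.
    """
    machine = platform.machine() if machine is None else machine
    return _MACHINE_TO_ABI.get(machine.lower())
```

Most recent Android phones report aarch64, which maps to arm64-v8a.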

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-id_ID-news_tts-medium with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-id_ID-news_tts-medium/id_ID-news_tts-medium.onnx";
  config.model.vits.tokens = "vits-piper-id_ID-news_tts-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-id_ID-news_tts-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

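  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
  if (!tts) {
    // Creation fails if, for example, a model path is wrong
    fprintf(stderr, "Failed to create the TTS engine. Please check your config\n");
    return -1;
  }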

  const char *text = "Jangan tanyakan apa yang negara bisa berikan kepadamu, tapi tanyakan apa yang bisa kamu berikan untuk negaramu.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
  -DSHERPA_ONNX_ENABLE_C_API=ON \
  -DCMAKE_BUILD_TYPE=Release \
  -DBUILD_SHARED_LIBS=ON \
  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc \
  -I /tmp/sherpa-onnx/shared/include \
  -L /tmp/sherpa-onnx/shared/lib \
  -o /tmp/test-piper \
  /tmp/test-piper.c \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html
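Once the program has written ./test.wav, you may want to confirm that the file has the sample rate listed for this model (22050). The Python sketch below reads the WAV header with the standard library wave module. It assumes the file is plain PCM, which is what that module expects, and the function name wav_summary is hypothetical.

```python
import wave


def wav_summary(path):
    """Return (channels, sample_rate, duration_seconds) of a PCM WAV file."""
    with wave.open(path, "rb") as f:
        rate = f.getframerate()
        return f.getnchannels(), rate, f.getnframes() / rate
```

For this model, wav_summary("/tmp/test.wav") is expected to report a sample rate of 22050.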

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-id_ID-news_tts-medium.tar.bz2

You can use the following code to play with vits-piper-id_ID-news_tts-medium

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-id_ID-news_tts-medium/id_ID-news_tts-medium.onnx",
            data_dir="vits-piper-id_ID-news_tts-medium/espeak-ng-data",
            tokens="vits-piper-id_ID-news_tts-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Jangan tanyakan apa yang negara bisa berikan kepadamu, tapi tanyakan apa yang bisa kamu berikan untuk negaramu.",
                     sid=0,
                     speed=1.0)

sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)

C++ API

You can use the following code to play with vits-piper-id_ID-news_tts-medium with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-id_ID-news_tts-medium/id_ID-news_tts-medium.onnx";
  config.model.vits.tokens = "vits-piper-id_ID-news_tts-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-id_ID-news_tts-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Jangan tanyakan apa yang negara bisa berikan kepadamu, tapi tanyakan apa yang bisa kamu berikan untuk negaramu.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
  -DSHERPA_ONNX_ENABLE_C_API=ON \
  -DCMAKE_BUILD_TYPE=Release \
  -DBUILD_SHARED_LIBS=ON \
  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++ \
  -std=c++17 \
  -I /tmp/sherpa-onnx/shared/include \
  -L /tmp/sherpa-onnx/shared/lib \
  -o /tmp/test-piper \
  /tmp/test-piper.cc \
  -lsherpa-onnx-cxx-api \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with vits-piper-id_ID-news_tts-medium with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-id_ID-news_tts-medium/id_ID-news_tts-medium.onnx".into()),
                tokens: Some("vits-piper-id_ID-news_tts-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-id_ID-news_tts-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Jangan tanyakan apa yang negara bisa berikan kepadamu, tapi tanyakan apa yang bisa kamu berikan untuk negaramu.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-id_ID-news_tts-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-id_ID-news_tts-medium/id_ID-news_tts-medium.onnx',
        tokens: 'vits-piper-id_ID-news_tts-medium/tokens.txt',
        dataDir: 'vits-piper-id_ID-news_tts-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Jangan tanyakan apa yang negara bisa berikan kepadamu, tapi tanyakan apa yang bisa kamu berikan untuk negaramu.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to play with vits-piper-id_ID-news_tts-medium with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-id_ID-news_tts-medium/id_ID-news_tts-medium.onnx',
    tokens: 'vits-piper-id_ID-news_tts-medium/tokens.txt',
    dataDir: 'vits-piper-id_ID-news_tts-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Jangan tanyakan apa yang negara bisa berikan kepadamu, tapi tanyakan apa yang bisa kamu berikan untuk negaramu.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-id_ID-news_tts-medium with Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-id_ID-news_tts-medium/id_ID-news_tts-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-id_ID-news_tts-medium/tokens.txt",
    dataDir: "vits-piper-id_ID-news_tts-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Jangan tanyakan apa yang negara bisa berikan kepadamu, tapi tanyakan apa yang bisa kamu berikan untuk negaramu."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-id_ID-news_tts-medium with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-id_ID-news_tts-medium/id_ID-news_tts-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-id_ID-news_tts-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-id_ID-news_tts-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Jangan tanyakan apa yang negara bisa berikan kepadamu, tapi tanyakan apa yang bisa kamu berikan untuk negaramu.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-id_ID-news_tts-medium with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-id_ID-news_tts-medium/id_ID-news_tts-medium.onnx",
        tokens = "vits-piper-id_ID-news_tts-medium/tokens.txt",
        dataDir = "vits-piper-id_ID-news_tts-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Jangan tanyakan apa yang negara bisa berikan kepadamu, tapi tanyakan apa yang bisa kamu berikan untuk negaramu.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with vits-piper-id_ID-news_tts-medium with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-id_ID-news_tts-medium/id_ID-news_tts-medium.onnx");
    vits.setTokens("vits-piper-id_ID-news_tts-medium/tokens.txt");
    vits.setDataDir("vits-piper-id_ID-news_tts-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Jangan tanyakan apa yang negara bisa berikan kepadamu, tapi tanyakan apa yang bisa kamu berikan untuk negaramu.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to try vits-piper-id_ID-news_tts-medium with the Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-id_ID-news_tts-medium/id_ID-news_tts-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-id_ID-news_tts-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-id_ID-news_tts-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Jangan tanyakan apa yang negara bisa berikan kepadamu, tapi tanyakan apa yang bisa kamu berikan untuk negaramu.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to try vits-piper-id_ID-news_tts-medium with the Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-id_ID-news_tts-medium/id_ID-news_tts-medium.onnx",
				Tokens:  "vits-piper-id_ID-news_tts-medium/tokens.txt",
				DataDir: "vits-piper-id_ID-news_tts-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Jangan tanyakan apa yang negara bisa berikan kepadamu, tapi tanyakan apa yang bisa kamu berikan untuk negaramu."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Jangan tanyakan apa yang negara bisa berikan kepadamu, tapi tanyakan apa yang bisa kamu berikan untuk negaramu.

sample audios for different speakers are listed below:

Speaker 0

supertonic-3-id

Info about this model

This model is supertonic 3 from https://huggingface.co/Supertone/supertonic-3

It supports 31 languages: en, ko, ja, ar, bg, cs, da, de, el, es, et, fi, fr, hi, hr, hu, id, it, lt, lv, nl, pl, pt, ro, ru, sk, sl, sv, tr, uk, vi.

This page shows samples for Indonesian (id).
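
The API examples later on this page select the language per request through a `lang` entry. The following is a minimal sketch, in Python, of validating a language code against the list above before passing it on. The names `SUPERTONIC_3_LANGS` and `check_lang` are hypothetical helpers for illustration and are not part of sherpa-onnx.

```python
# Language codes copied from the list above.
SUPERTONIC_3_LANGS = (
    "en ko ja ar bg cs da de el es et fi fr hi hr hu id it lt lv "
    "nl pl pt ro ru sk sl sv tr uk vi"
).split()


def check_lang(lang: str) -> str:
    """Return the normalized code, or raise ValueError if it is not listed."""
    code = lang.strip().lower()
    if code not in SUPERTONIC_3_LANGS:
        raise ValueError(f"unsupported language code: {lang!r}")
    return code


if __name__ == "__main__":
    print(len(SUPERTONIC_3_LANGS))
    print(check_lang("ID"))
```

Rejecting an unknown code early keeps a typo from reaching the model as a silently ignored or misinterpreted option.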

Number of speakers    Sample rate
10                    24000

Speaker IDs

sid    0    1    2    3    4    5    6    7    8    9
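
To compare the voices, it can help to synthesize the same text once per speaker ID. The sketch below, in Python, only shows the loop. `synthesize_all_speakers` and the `synthesize` callable it takes are hypothetical names, and the callable is assumed to wrap whichever API calls you use to generate and save audio for one speaker.

```python
from typing import Callable, List


def synthesize_all_speakers(
    synthesize: Callable[[str, int, str], None],
    text: str,
    num_speakers: int = 10,
    prefix: str = "test",
) -> List[str]:
    """Call synthesize(text, sid, filename) once per speaker ID."""
    filenames = []
    for sid in range(num_speakers):  # sid runs from 0 to num_speakers - 1
        filename = f"{prefix}-sid-{sid}.wav"
        synthesize(text, sid, filename)
        filenames.append(filename)
    return filenames
```

With the Python API, the callable would set `gen_config.sid`, call `tts.generate`, and write the returned samples to `filename`.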

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-supertonic-3-tts-int8-2026-05-11.tar.bz2
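
If you prefer to fetch the archive from a script, the following sketch uses only the Python standard library. The function names `download` and `extract` are hypothetical, and extracting into the current directory is an assumption made so that the relative paths used in the API examples resolve.

```python
import tarfile
import urllib.request
from pathlib import Path

URL = (
    "https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/"
    "sherpa-onnx-supertonic-3-tts-int8-2026-05-11.tar.bz2"
)


def download(url: str, dest: Path) -> Path:
    """Download url to dest unless dest already exists."""
    if not dest.exists():
        urllib.request.urlretrieve(url, dest)
    return dest


def extract(archive: Path, out_dir: Path) -> list:
    """Extract a .tar.bz2 archive into out_dir and return the member names."""
    with tarfile.open(archive, "r:bz2") as tar:
        names = tar.getnames()
        tar.extractall(out_dir)
    return names


if __name__ == "__main__":
    archive = download(URL, Path(URL.rsplit("/", 1)[-1]))
    for name in extract(archive, Path(".")):
        print(name)
```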

Android APK

The following table lists the Android TTS Engine APKs that bundle this model, built for sherpa-onnx v1.13.2.

ABI            URL         China mirror
arm64-v8a      Download    Download
armeabi-v7a    Download    Download
x86_64         Download    Download
x86            Download    Download

If you are not sure which ABI your device uses, arm64-v8a is most likely the right choice.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and have downloaded and extracted the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-supertonic-3-tts-int8-2026-05-11.tar.bz2

You can then use the following code to try supertonic-3:

import sherpa_onnx
import soundfile as sf

tts_config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        supertonic=sherpa_onnx.OfflineTtsSupertonicModelConfig(
            duration_predictor="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
            text_encoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
            vector_estimator="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
            vocoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
            tts_json="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
            unicode_indexer="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
            voice_style="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
        ),
        debug=False,
        num_threads=2,
        provider="cpu",
    ),
)

tts = sherpa_onnx.OfflineTts(tts_config)

gen_config = sherpa_onnx.GenerationConfig()
gen_config.sid = 0
gen_config.num_steps = 8
gen_config.speed = 1.0
gen_config.extra["lang"] = "id"

audio = tts.generate("Ini adalah mesin text-to-speech yang menggunakan Kaldi generasi berikutnya", gen_config)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
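
After running the script above, a quick way to confirm that the output is sensible is to read back its header. The sketch below uses Python's standard `wave` module and assumes `test.wav` was written as a PCM WAV file. `wav_info` is a hypothetical helper name.

```python
import wave


def wav_info(path: str) -> dict:
    """Return sample rate, channel count and duration of a PCM WAV file."""
    with wave.open(path, "rb") as f:
        sample_rate = f.getframerate()
        num_frames = f.getnframes()
        return {
            "sample_rate": sample_rate,
            "channels": f.getnchannels(),
            "seconds": num_frames / sample_rate,
        }


if __name__ == "__main__":
    print(wav_info("test.wav"))
```

The reported sample rate should match `audio.sample_rate` from the generation step.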

C API

You can use the following code to try supertonic-3 with the C API.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));
  config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
  config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
  config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
  config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
  config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
  config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
  config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";

  config.model.num_threads = 2;

  // If you don't want to see debug messages, please set it to 0
  config.model.debug = 1;

  const char *text = "Ini adalah mesin text-to-speech yang menggunakan Kaldi generasi berikutnya";

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.num_steps = 8;
  gen_cfg.speed = 1.0;
  gen_cfg.extra = "{\"lang\": \"id\"}";

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg,
                                             ProgressCallback, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
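}

To compile this example, first build sherpa-onnx as a shared library and install it to /tmp/sherpa-onnx/shared:

cd /tmp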
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
 -DSHERPA_ONNX_ENABLE_C_API=ON \
 -DCMAKE_BUILD_TYPE=Release \
 -DBUILD_SHARED_LIBS=ON \
 -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
 ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-supertonic.c. Then you can compile it with the following command:

gcc \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-supertonic \
  /tmp/test-supertonic.c \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-supertonic.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html
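
The C example above passes the language as a hand-escaped JSON string in `gen_cfg.extra`. When such strings are produced by a script, for example while generating test sources, letting a JSON library do the serialization avoids quoting mistakes. This is a sketch in Python; `make_extra` and `as_c_literal` are hypothetical helper names, and only the `lang` key shown in the example is assumed.

```python
import json


def make_extra(lang: str) -> str:
    """Serialize the per-request options as a JSON object string."""
    return json.dumps({"lang": lang})


def as_c_literal(s: str) -> str:
    """Quote s as a C string literal (escapes backslashes and double quotes)."""
    return '"' + s.replace("\\", "\\\\").replace('"', '\\"') + '"'


if __name__ == "__main__":
    extra = make_extra("id")
    print(extra)
    print(as_c_literal(extra))
```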

C++ API

You can use the following code to try supertonic-3 with the C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;

  config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
  config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
  config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
  config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
  config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
  config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
  config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";

  config.model.num_threads = 2;

  // If you don't want to see debug messages, please set it to 0
  config.model.debug = 1;

  std::string filename = "./test.wav";
  std::string text = "Ini adalah mesin text-to-speech yang menggunakan Kaldi generasi berikutnya";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.num_steps = 8;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed
  gen_cfg.extra = R"({"lang": "id"})";

  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
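}

To compile this example, first build sherpa-onnx as a shared library and install it to /tmp/sherpa-onnx/shared:

cd /tmp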
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
 -DSHERPA_ONNX_ENABLE_C_API=ON \
 -DCMAKE_BUILD_TYPE=Release \
 -DBUILD_SHARED_LIBS=ON \
 -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
 ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-supertonic.cc. Then you can compile it with the following command:

g++ \
  -std=c++17 \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-supertonic \
  /tmp/test-supertonic.cc \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-cxx-api \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-supertonic.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html
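
The `ProgressCallback` in the C and C++ examples always returns 1, so generation never stops early. The sketch below, in Python, illustrates the same return-value contract with a callback that asks to stop once a sample budget is used up. `make_callback` is a hypothetical name, and whether a given language binding accepts a callback with exactly this `(samples, progress)` shape is an assumption.

```python
from typing import Callable, Sequence


def make_callback(max_samples: int) -> Callable[[Sequence[float], float], int]:
    """Build a callback that asks generation to stop after max_samples samples."""
    received = 0

    def callback(samples: Sequence[float], progress: float) -> int:
        nonlocal received
        received += len(samples)
        # 1 means continue generating, 0 means stop generating
        return 1 if received < max_samples else 0

    return callback
```

Keeping the counter in a closure means each generation request can get its own budget without global state.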

Rust API

You can use the following code to try supertonic-3 with the Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsSupertonicModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            supertonic: OfflineTtsSupertonicModelConfig {
                duration_predictor: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx".into()),
                text_encoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx".into()),
                vector_estimator: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx".into()),
                vocoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx".into()),
                tts_json: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json".into()),
                unicode_indexer: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin".into()),
                voice_style: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Ini adalah mesin text-to-speech yang menggunakan Kaldi generasi berikutnya";

    let gen_config = GenerationConfig {
        sid: 0,
        num_steps: 8,
        speed: 1.0,
        extra: Some(r#"{"lang": "id"}"#.into()),
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to try supertonic-3 with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      supertonic: {
        durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
        textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
        vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
        vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
        ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
        unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
        voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
      },
      debug: true,
      numThreads: 2,
      provider: 'cpu',
    },
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Ini adalah mesin text-to-speech yang menggunakan Kaldi generasi berikutnya';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  numSteps: 8,
  speed: 1.0,
  extra: {lang: 'id'},
});

const start = Date.now();
const audio = tts.generate({text, generationConfig});
const stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.
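
The Node.js example above reports a real-time factor (RTF), which is the processing time divided by the duration of the generated audio. As a small worked sketch in Python, with a hypothetical helper name:

```python
def real_time_factor(elapsed_seconds: float, num_samples: int, sample_rate: int) -> float:
    """RTF = processing time / audio duration; below 1 is faster than real time."""
    duration = num_samples / sample_rate
    return elapsed_seconds / duration


if __name__ == "__main__":
    # 48000 samples at 24000 Hz are 2 seconds of audio; 0.5 s of compute gives 0.25
    print(real_time_factor(0.5, 48000, 24000))
```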

Dart API

You can use the following code to try supertonic-3 with the Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final supertonic = sherpa_onnx.OfflineTtsSupertonicModelConfig(
    durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
    textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
    vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
    vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
    ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
    unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
    voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    supertonic: supertonic,
    numThreads: 2,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    numSteps: 8,
    speed: 1.0,
    extra: {'lang': 'id'},
  );
  final audio = tts.generateWithConfig(text: 'Ini adalah mesin text-to-speech yang menggunakan Kaldi generasi berikutnya', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to try supertonic-3 with the Swift API.

func run() {
  let supertonic = sherpaOnnxOfflineTtsSupertonicModelConfig(
    durationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
    textEncoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
    vectorEstimator: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
    vocoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
    ttsJson: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
    unicodeIndexer: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
    voiceStyle: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(supertonic: supertonic)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Ini adalah mesin text-to-speech yang menggunakan Kaldi generasi berikutnya"
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.numSteps = 8
  genConfig.speed = 1.0
  genConfig.extra = ["lang": "id"]

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to try supertonic-3 with the C# API.

using System;
using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Supertonic.DurationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.Model.Supertonic.TextEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.Model.Supertonic.VectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.Model.Supertonic.Vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.Model.Supertonic.TtsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.Model.Supertonic.UnicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.Model.Supertonic.VoiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";

config.Model.NumThreads = 2;
config.Model.Debug = 1;
config.Model.Provider = "cpu";

var tts = new OfflineTts(config);
var text = "Ini adalah mesin text-to-speech yang menggunakan Kaldi generasi berikutnya";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.NumSteps = 8;
genConfig.Speed = 1.0f;
genConfig.Extra = "{\"lang\": \"id\"}";

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to try supertonic-3 with the Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  val config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      supertonic = OfflineTtsSupertonicModelConfig(
        durationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
        textEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
        vectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
        vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
        ttsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
        unicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
        voiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
      ),
      numThreads = 2,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    numSteps = 8,
    speed = 1.0f,
    extra = mapOf("lang" to "id"),
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Ini adalah mesin text-to-speech yang menggunakan Kaldi generasi berikutnya",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to try supertonic-3 with the Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var supertonic = new OfflineTtsSupertonicModelConfig();
    supertonic.setDurationPredictor("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx");
    supertonic.setTextEncoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx");
    supertonic.setVectorEstimator("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx");
    supertonic.setVocoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx");
    supertonic.setTtsJson("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json");
    supertonic.setUnicodeIndexer("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin");
    supertonic.setVoiceStyle("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setSupertonic(supertonic);
    modelConfig.setNumThreads(2);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);

    var tts = new OfflineTts(config);
    var text = "Ini adalah mesin text-to-speech yang menggunakan Kaldi generasi berikutnya";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setNumSteps(8);
    genConfig.setSpeed(1.0f);
    genConfig.setExtra("{\"lang\": \"id\"}");

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to try supertonic-3 with the Pascal API.

program test_supertonic;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Supertonic.DurationPredictor := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx';
  Config.Model.Supertonic.TextEncoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx';
  Config.Model.Supertonic.VectorEstimator := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx';
  Config.Model.Supertonic.Vocoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx';
  Config.Model.Supertonic.TtsJson := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json';
  Config.Model.Supertonic.UnicodeIndexer := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin';
  Config.Model.Supertonic.VoiceStyle := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin';

  Config.Model.NumThreads := 2;
  Config.Model.Debug := True;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.NumSteps := 8;
  GenConfig.Speed := 1.0;
  GenConfig.Extra := '{"lang": "id"}';

  Audio := Tts.GenerateWithConfig('Ini adalah mesin text-to-speech yang menggunakan Kaldi generasi berikutnya', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to try supertonic-3 with the Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Supertonic: sherpa.OfflineTtsSupertonicModelConfig{
				DurationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
				TextEncoder:       "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
				VectorEstimator:   "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
				Vocoder:           "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
				TtsJson:           "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
				UnicodeIndexer:    "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
				VoiceStyle:        "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
			},
			NumThreads: 2,
			Debug:      true,
		},
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Ini adalah mesin text-to-speech yang menggunakan Kaldi generasi berikutnya"

	genConfig := sherpa.GenerationConfig{
		Sid:       0,
		NumSteps:  8,
		Speed:     1.0,
		Extra:     `{"lang": "id"}`,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

Audio samples for each speaker are listed below. Under each speaker, the number is the index of the sentence that follows it.

Speaker 0

0

Halo dunia.

1

Apa kabar hari ini?

2

Langit berwarna biru dan angin terasa lembut.

3

Pembelajaran mesin membantu komputer belajar dari data.

4

Sintesis ucapan mengubah teks menjadi suara yang jelas.

5

Para siswa membaca cerita pendek di perpustakaan.

6

Kereta terlambat karena perawatan rel.

7

Model kecil berjalan cepat di perangkat lokal.

8

Asisten suara membantu pekerjaan sehari-hari.

9

Pembacaan yang stabil penting untuk kalimat pendek dan panjang.

Speaker 1

0

Halo dunia.

1

Apa kabar hari ini?

2

Langit berwarna biru dan angin terasa lembut.

3

Pembelajaran mesin membantu komputer belajar dari data.

4

Sintesis ucapan mengubah teks menjadi suara yang jelas.

5

Para siswa membaca cerita pendek di perpustakaan.

6

Kereta terlambat karena perawatan rel.

7

Model kecil berjalan cepat di perangkat lokal.

8

Asisten suara membantu pekerjaan sehari-hari.

9

Pembacaan yang stabil penting untuk kalimat pendek dan panjang.

Speaker 2

0

Halo dunia.

1

Apa kabar hari ini?

2

Langit berwarna biru dan angin terasa lembut.

3

Pembelajaran mesin membantu komputer belajar dari data.

4

Sintesis ucapan mengubah teks menjadi suara yang jelas.

5

Para siswa membaca cerita pendek di perpustakaan.

6

Kereta terlambat karena perawatan rel.

7

Model kecil berjalan cepat di perangkat lokal.

8

Asisten suara membantu pekerjaan sehari-hari.

9

Pembacaan yang stabil penting untuk kalimat pendek dan panjang.

Speaker 3

0

Halo dunia.

1

Apa kabar hari ini?

2

Langit berwarna biru dan angin terasa lembut.

3

Pembelajaran mesin membantu komputer belajar dari data.

4

Sintesis ucapan mengubah teks menjadi suara yang jelas.

5

Para siswa membaca cerita pendek di perpustakaan.

6

Kereta terlambat karena perawatan rel.

7

Model kecil berjalan cepat di perangkat lokal.

8

Asisten suara membantu pekerjaan sehari-hari.

9

Pembacaan yang stabil penting untuk kalimat pendek dan panjang.

Speaker 4

0

Halo dunia.

1

Apa kabar hari ini?

2

Langit berwarna biru dan angin terasa lembut.

3

Pembelajaran mesin membantu komputer belajar dari data.

4

Sintesis ucapan mengubah teks menjadi suara yang jelas.

5

Para siswa membaca cerita pendek di perpustakaan.

6

Kereta terlambat karena perawatan rel.

7

Model kecil berjalan cepat di perangkat lokal.

8

Asisten suara membantu pekerjaan sehari-hari.

9

Pembacaan yang stabil penting untuk kalimat pendek dan panjang.

Speaker 5

0

Halo dunia.

1

Apa kabar hari ini?

2

Langit berwarna biru dan angin terasa lembut.

3

Pembelajaran mesin membantu komputer belajar dari data.

4

Sintesis ucapan mengubah teks menjadi suara yang jelas.

5

Para siswa membaca cerita pendek di perpustakaan.

6

Kereta terlambat karena perawatan rel.

7

Model kecil berjalan cepat di perangkat lokal.

8

Asisten suara membantu pekerjaan sehari-hari.

9

Pembacaan yang stabil penting untuk kalimat pendek dan panjang.

Speaker 6

0

Halo dunia.

1

Apa kabar hari ini?

2

Langit berwarna biru dan angin terasa lembut.

3

Pembelajaran mesin membantu komputer belajar dari data.

4

Sintesis ucapan mengubah teks menjadi suara yang jelas.

5

Para siswa membaca cerita pendek di perpustakaan.

6

Kereta terlambat karena perawatan rel.

7

Model kecil berjalan cepat di perangkat lokal.

8

Asisten suara membantu pekerjaan sehari-hari.

9

Pembacaan yang stabil penting untuk kalimat pendek dan panjang.

Speaker 7

0

Halo dunia.

1

Apa kabar hari ini?

2

Langit berwarna biru dan angin terasa lembut.

3

Pembelajaran mesin membantu komputer belajar dari data.

4

Sintesis ucapan mengubah teks menjadi suara yang jelas.

5

Para siswa membaca cerita pendek di perpustakaan.

6

Kereta terlambat karena perawatan rel.

7

Model kecil berjalan cepat di perangkat lokal.

8

Asisten suara membantu pekerjaan sehari-hari.

9

Pembacaan yang stabil penting untuk kalimat pendek dan panjang.

Speaker 8

0

Halo dunia.

1

Apa kabar hari ini?

2

Langit berwarna biru dan angin terasa lembut.

3

Pembelajaran mesin membantu komputer belajar dari data.

4

Sintesis ucapan mengubah teks menjadi suara yang jelas.

5

Para siswa membaca cerita pendek di perpustakaan.

6

Kereta terlambat karena perawatan rel.

7

Model kecil berjalan cepat di perangkat lokal.

8

Asisten suara membantu pekerjaan sehari-hari.

9

Pembacaan yang stabil penting untuk kalimat pendek dan panjang.

Speaker 9

0

Halo dunia.

1

Apa kabar hari ini?

2

Langit berwarna biru dan angin terasa lembut.

3

Pembelajaran mesin membantu komputer belajar dari data.

4

Sintesis ucapan mengubah teks menjadi suara yang jelas.

5

Para siswa membaca cerita pendek di perpustakaan.

6

Kereta terlambat karena perawatan rel.

7

Model kecil berjalan cepat di perangkat lokal.

8

Asisten suara membantu pekerjaan sehari-hari.

9

Pembacaan yang stabil penting untuk kalimat pendek dan panjang.

Italian

This section lists text to speech models for Italian.

vits-piper-it_IT-dii-high

Info about this model

This model is converted from https://huggingface.co/OpenVoiceOS/pipertts_it-IT_dii

Number of speakers | Sample rate (Hz)
1 | 22050

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-it_IT-dii-high.tar.bz2
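
The archive can also be fetched and unpacked from Python. The following is a minimal sketch that uses only the standard library. The helper names (`archive_url`, `extract_archive`, `download_and_extract`) are hypothetical, and the sketch assumes the archive unpacks into a directory named after the model, as the relative paths in the API examples suggest.

```python
import tarfile
import urllib.request
from pathlib import Path

# Release URL prefix taken from the download address above.
BASE_URL = "https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models"


def archive_url(model_name: str) -> str:
    """Build the release URL for a model archive (hypothetical helper)."""
    return f"{BASE_URL}/{model_name}.tar.bz2"


def extract_archive(archive: Path, dest: Path) -> None:
    """Unpack a .tar.bz2 archive into dest."""
    with tarfile.open(archive, "r:bz2") as tar:
        tar.extractall(dest)


def download_and_extract(model_name: str, dest: Path = Path(".")) -> Path:
    """Download the archive, unpack it, and delete the archive afterwards."""
    archive = dest / f"{model_name}.tar.bz2"
    urllib.request.urlretrieve(archive_url(model_name), archive)
    extract_archive(archive, dest)
    archive.unlink()
    # Assumption: the archive holds a top-level directory named after the model.
    return dest / model_name
```

For example, calling `download_and_extract("vits-piper-it_IT-dii-high")` from /tmp should leave the model under /tmp/vits-piper-it_IT-dii-high, which matches the relative paths used by the examples.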

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI | URL | China mirror
arm64-v8a | Download | Download
armeabi-v7a | Download | Download
x86_64 | Download | Download
x86 | Download | Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-it_IT-dii-high with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-it_IT-dii-high/it_IT-dii-high.onnx";
  config.model.vits.tokens = "vits-piper-it_IT-dii-high/tokens.txt";
  config.model.vits.data_dir = "vits-piper-it_IT-dii-high/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc   -I /tmp/sherpa-onnx/shared/include   -L /tmp/sherpa-onnx/shared/lib   -o /tmp/test-piper   /tmp/test-piper.c   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-it_IT-dii-high.tar.bz2

You can use the following code to play with vits-piper-it_IT-dii-high

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-it_IT-dii-high/it_IT-dii-high.onnx",
            data_dir="vits-piper-it_IT-dii-high/espeak-ng-data",
            tokens="vits-piper-it_IT-dii-high/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme.",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
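
To gauge synthesis speed, you can compute the real-time factor (RTF): the elapsed time divided by the duration of the generated audio, where the duration is the number of samples divided by the sample rate. The following is a minimal sketch. `real_time_factor` and `timed_generate` are hypothetical helpers, not part of sherpa-onnx, and the sketch assumes the returned audio exposes `samples` and `sample_rate` as in the code above.

```python
import time


def real_time_factor(elapsed_seconds: float, num_samples: int, sample_rate: int) -> float:
    """Return elapsed time divided by the audio duration in seconds."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate
    return elapsed_seconds / duration


def timed_generate(generate, **kwargs):
    """Call generate(**kwargs) and return (audio, rtf)."""
    start = time.perf_counter()
    audio = generate(**kwargs)
    elapsed = time.perf_counter() - start
    rtf = real_time_factor(elapsed, len(audio.samples), audio.sample_rate)
    return audio, rtf
```

With the `tts` object created above, a call such as `audio, rtf = timed_generate(tts.generate, text="...", sid=0, speed=1.0)` would return both the audio and its RTF. An RTF below 1 means the audio was generated faster than it takes to play.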

C++ API

You can use the following code to play with vits-piper-it_IT-dii-high with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-it_IT-dii-high/it_IT-dii-high.onnx";
  config.model.vits.tokens = "vits-piper-it_IT-dii-high/tokens.txt";
  config.model.vits.data_dir = "vits-piper-it_IT-dii-high/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++   -std=c++17   -I /tmp/sherpa-onnx/shared/include   -L /tmp/sherpa-onnx/shared/lib   -o /tmp/test-piper   /tmp/test-piper.cc   -lsherpa-onnx-cxx-api   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with vits-piper-it_IT-dii-high with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-it_IT-dii-high/it_IT-dii-high.onnx".into()),
                tokens: Some("vits-piper-it_IT-dii-high/tokens.txt".into()),
                data_dir: Some("vits-piper-it_IT-dii-high/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-it_IT-dii-high with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-it_IT-dii-high/it_IT-dii-high.onnx',
        tokens: 'vits-piper-it_IT-dii-high/tokens.txt',
        dataDir: 'vits-piper-it_IT-dii-high/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to play with vits-piper-it_IT-dii-high with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-it_IT-dii-high/it_IT-dii-high.onnx',
    tokens: 'vits-piper-it_IT-dii-high/tokens.txt',
    dataDir: 'vits-piper-it_IT-dii-high/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-it_IT-dii-high with Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-it_IT-dii-high/it_IT-dii-high.onnx",
    lexicon: "",
    tokens: "vits-piper-it_IT-dii-high/tokens.txt",
    dataDir: "vits-piper-it_IT-dii-high/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-it_IT-dii-high with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-it_IT-dii-high/it_IT-dii-high.onnx";
config.Model.Vits.Tokens = "vits-piper-it_IT-dii-high/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-it_IT-dii-high/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-it_IT-dii-high with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-it_IT-dii-high/it_IT-dii-high.onnx",
        tokens = "vits-piper-it_IT-dii-high/tokens.txt",
        dataDir = "vits-piper-it_IT-dii-high/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with vits-piper-it_IT-dii-high with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-it_IT-dii-high/it_IT-dii-high.onnx");
    vits.setTokens("vits-piper-it_IT-dii-high/tokens.txt");
    vits.setDataDir("vits-piper-it_IT-dii-high/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-it_IT-dii-high with Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-it_IT-dii-high/it_IT-dii-high.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-it_IT-dii-high/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-it_IT-dii-high/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-it_IT-dii-high with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-it_IT-dii-high/it_IT-dii-high.onnx",
				Tokens:  "vits-piper-it_IT-dii-high/tokens.txt",
				DataDir: "vits-piper-it_IT-dii-high/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme.

audio samples for each speaker are listed below:

Speaker 0

vits-piper-it_IT-miro-high

Info about this model

This model is converted from https://huggingface.co/OpenVoiceOS/pipertts_it-IT_miro

Number of speakers | Sample rate (Hz)
1 | 22050

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-it_IT-miro-high.tar.bz2

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI | URL | China mirror
arm64-v8a | Download | Download
armeabi-v7a | Download | Download
x86_64 | Download | Download
x86 | Download | Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-it_IT-miro-high with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-it_IT-miro-high/it_IT-miro-high.onnx";
  config.model.vits.tokens = "vits-piper-it_IT-miro-high/tokens.txt";
  config.model.vits.data_dir = "vits-piper-it_IT-miro-high/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc   -I /tmp/sherpa-onnx/shared/include   -L /tmp/sherpa-onnx/shared/lib   -o /tmp/test-piper   /tmp/test-piper.c   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-it_IT-miro-high.tar.bz2

You can use the following code to play with vits-piper-it_IT-miro-high

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-it_IT-miro-high/it_IT-miro-high.onnx",
            data_dir="vits-piper-it_IT-miro-high/espeak-ng-data",
            tokens="vits-piper-it_IT-miro-high/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme.",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
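
If `config.validate()` fails, one thing worth checking is whether the paths passed to the config actually exist. The following is a minimal sketch of such a check. `missing_model_files` is a hypothetical helper, not part of sherpa-onnx, and the three file names are taken from the example above.

```python
from pathlib import Path


def missing_model_files(model_dir: str, onnx_name: str) -> list:
    """Return the expected paths under model_dir that do not exist."""
    root = Path(model_dir)
    # The three paths the VITS config in the example above points at.
    expected = [root / onnx_name, root / "tokens.txt", root / "espeak-ng-data"]
    return [str(p) for p in expected if not p.exists()]


missing = missing_model_files("vits-piper-it_IT-miro-high", "it_IT-miro-high.onnx")
if missing:
    print("Missing:", ", ".join(missing))
```

Running this from the directory where the archive was extracted should print nothing. Any path it lists is one to fix before creating `OfflineTts`.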

C++ API

You can use the following code to play with vits-piper-it_IT-miro-high with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-it_IT-miro-high/it_IT-miro-high.onnx";
  config.model.vits.tokens = "vits-piper-it_IT-miro-high/tokens.txt";
  config.model.vits.data_dir = "vits-piper-it_IT-miro-high/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++   -std=c++17   -I /tmp/sherpa-onnx/shared/include   -L /tmp/sherpa-onnx/shared/lib   -o /tmp/test-piper   /tmp/test-piper.cc   -lsherpa-onnx-cxx-api   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with vits-piper-it_IT-miro-high with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-it_IT-miro-high/it_IT-miro-high.onnx".into()),
                tokens: Some("vits-piper-it_IT-miro-high/tokens.txt".into()),
                data_dir: Some("vits-piper-it_IT-miro-high/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-it_IT-miro-high with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-it_IT-miro-high/it_IT-miro-high.onnx',
        tokens: 'vits-piper-it_IT-miro-high/tokens.txt',
        dataDir: 'vits-piper-it_IT-miro-high/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to play with vits-piper-it_IT-miro-high with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-it_IT-miro-high/it_IT-miro-high.onnx',
    tokens: 'vits-piper-it_IT-miro-high/tokens.txt',
    dataDir: 'vits-piper-it_IT-miro-high/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-it_IT-miro-high with Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-it_IT-miro-high/it_IT-miro-high.onnx",
    lexicon: "",
    tokens: "vits-piper-it_IT-miro-high/tokens.txt",
    dataDir: "vits-piper-it_IT-miro-high/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-it_IT-miro-high with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-it_IT-miro-high/it_IT-miro-high.onnx";
config.Model.Vits.Tokens = "vits-piper-it_IT-miro-high/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-it_IT-miro-high/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-it_IT-miro-high with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-it_IT-miro-high/it_IT-miro-high.onnx",
        tokens = "vits-piper-it_IT-miro-high/tokens.txt",
        dataDir = "vits-piper-it_IT-miro-high/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with vits-piper-it_IT-miro-high with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-it_IT-miro-high/it_IT-miro-high.onnx");
    vits.setTokens("vits-piper-it_IT-miro-high/tokens.txt");
    vits.setDataDir("vits-piper-it_IT-miro-high/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-it_IT-miro-high with Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-it_IT-miro-high/it_IT-miro-high.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-it_IT-miro-high/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-it_IT-miro-high/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-it_IT-miro-high with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-it_IT-miro-high/it_IT-miro-high.onnx",
				Tokens:  "vits-piper-it_IT-miro-high/tokens.txt",
				DataDir: "vits-piper-it_IT-miro-high/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme.

sample audio for each speaker is listed below:

Speaker 0

vits-piper-it_IT-paola-medium

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/it/it_IT/paola/medium

| Number of speakers | Sample rate (Hz) |
|---|---|
| 1 | 22050 |

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-it_IT-paola-medium.tar.bz2
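
If you prefer to fetch the archive from a script, the following is a minimal sketch using only the Python standard library. The helper names (`archive_name`, `extract_archive`, `download_and_extract`) are hypothetical and not part of sherpa-onnx. The examples below assume the archive unpacks into a `vits-piper-it_IT-paola-medium/` directory in your working directory.

```python
import tarfile
import urllib.request
from pathlib import Path

# URL taken from the "Model download address" above.
MODEL_URL = (
    "https://github.com/k2-fsa/sherpa-onnx/releases/download/"
    "tts-models/vits-piper-it_IT-paola-medium.tar.bz2"
)


def archive_name(url: str) -> str:
    """Return the file name component of a download URL."""
    return url.rsplit("/", 1)[-1]


def extract_archive(archive: Path, dest: Path) -> list[str]:
    """Extract a .tar.bz2 archive into dest and return the member names."""
    with tarfile.open(archive, "r:bz2") as tar:
        names = tar.getnames()
        tar.extractall(dest)
    return names


def download_and_extract(url: str = MODEL_URL, dest: Path = Path(".")) -> list[str]:
    """Download the archive (needs network access), unpack it, then delete it."""
    archive = dest / archive_name(url)
    urllib.request.urlretrieve(url, archive)
    names = extract_archive(archive, dest)
    archive.unlink()  # the archive is no longer needed once extracted
    return names


# Usage (performs a real download, so it is left commented out):
# download_and_extract()
```

After extraction, check that the model file, `tokens.txt`, and `espeak-ng-data` are at the paths used in the examples below.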

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

| ABI | URL | China mirror |
|---|---|---|
| arm64-v8a | Download | Download |
| armeabi-v7a | Download | Download |
| x86_64 | Download | Download |
| x86 | Download | Download |

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-it_IT-paola-medium with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-it_IT-paola-medium/it_IT-paola-medium.onnx";
  config.model.vits.tokens = "vits-piper-it_IT-paola-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-it_IT-paola-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc   -I /tmp/sherpa-onnx/shared/include   -o /tmp/test-piper   /tmp/test-piper.c   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-it_IT-paola-medium.tar.bz2

You can use the following code to play with vits-piper-it_IT-paola-medium with Python API.

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-it_IT-paola-medium/it_IT-paola-medium.onnx",
            data_dir="vits-piper-it_IT-paola-medium/espeak-ng-data",
            tokens="vits-piper-it_IT-paola-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme.",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
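
Several examples on this page pass a callback that returns 1 to continue generating and 0 to stop. The sketch below builds such a callback in plain Python. It assumes that the Python `tts.generate` accepts a `callback` keyword argument and calls it with the newly generated samples and a progress value, so please check this against your installed version. `make_limit_callback` is a hypothetical helper name, not part of sherpa-onnx.

```python
from typing import Callable, Sequence


def make_limit_callback(max_samples: int) -> Callable[..., int]:
    """Build a callback that asks the generator to stop after max_samples.

    The returned function follows the convention used in the examples on
    this page: return 1 to continue generating, 0 to stop.
    """
    received = 0

    def callback(samples: Sequence[float], progress: float = 0.0) -> int:
        nonlocal received
        received += len(samples)  # count the samples delivered so far
        return 1 if received < max_samples else 0

    return callback


# Assumed usage (requires sherpa_onnx and the model, so it is commented out):
# stop_after_5s = make_limit_callback(5 * 22050)  # 22050 Hz is this model's rate
# audio = tts.generate(text, sid=0, speed=1.0, callback=stop_after_5s)
```

A closure keeps the running sample count without a global variable, so each call to `make_limit_callback` yields an independent counter.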

C++ API

You can use the following code to play with vits-piper-it_IT-paola-medium with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-it_IT-paola-medium/it_IT-paola-medium.onnx";
  config.model.vits.tokens = "vits-piper-it_IT-paola-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-it_IT-paola-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++   -std=c++17   -I /tmp/sherpa-onnx/shared/include   -o /tmp/test-piper   /tmp/test-piper.cc   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-cxx-api   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with vits-piper-it_IT-paola-medium with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-it_IT-paola-medium/it_IT-paola-medium.onnx".into()),
                tokens: Some("vits-piper-it_IT-paola-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-it_IT-paola-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-it_IT-paola-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-it_IT-paola-medium/it_IT-paola-medium.onnx',
        tokens: 'vits-piper-it_IT-paola-medium/tokens.txt',
        dataDir: 'vits-piper-it_IT-paola-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.
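
The example above prints a real-time factor (RTF): the elapsed synthesis time divided by the duration of the generated audio, so a value below 1 means synthesis ran faster than playback. Here is a minimal Python sketch of the same arithmetic. `real_time_factor` is a hypothetical helper name, not part of sherpa-onnx.

```python
def real_time_factor(elapsed_seconds: float, num_samples: int, sample_rate: int) -> float:
    """Return elapsed time divided by audio duration.

    The audio duration in seconds is num_samples / sample_rate.
    """
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate
    return elapsed_seconds / duration


# 44100 samples at 22050 Hz is 2 seconds of audio; generating it in
# 0.5 seconds gives an RTF of 0.25.
print(f"RTF = {real_time_factor(0.5, 44100, 22050):.3f}")
```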

Dart API

You can use the following code to play with vits-piper-it_IT-paola-medium with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-it_IT-paola-medium/it_IT-paola-medium.onnx',
    tokens: 'vits-piper-it_IT-paola-medium/tokens.txt',
    dataDir: 'vits-piper-it_IT-paola-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-it_IT-paola-medium with Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-it_IT-paola-medium/it_IT-paola-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-it_IT-paola-medium/tokens.txt",
    dataDir: "vits-piper-it_IT-paola-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-it_IT-paola-medium with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-it_IT-paola-medium/it_IT-paola-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-it_IT-paola-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-it_IT-paola-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-it_IT-paola-medium with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-it_IT-paola-medium/it_IT-paola-medium.onnx",
        tokens = "vits-piper-it_IT-paola-medium/tokens.txt",
        dataDir = "vits-piper-it_IT-paola-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with vits-piper-it_IT-paola-medium with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-it_IT-paola-medium/it_IT-paola-medium.onnx");
    vits.setTokens("vits-piper-it_IT-paola-medium/tokens.txt");
    vits.setDataDir("vits-piper-it_IT-paola-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-it_IT-paola-medium with Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-it_IT-paola-medium/it_IT-paola-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-it_IT-paola-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-it_IT-paola-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-it_IT-paola-medium with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-it_IT-paola-medium/it_IT-paola-medium.onnx",
				Tokens:  "vits-piper-it_IT-paola-medium/tokens.txt",
				DataDir: "vits-piper-it_IT-paola-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme.

sample audio for each speaker is listed below:

Speaker 0

vits-piper-it_IT-riccardo-x_low

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/it/it_IT/riccardo/x_low

| Number of speakers | Sample rate (Hz) |
|---|---|
| 1 | 16000 |

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-it_IT-riccardo-x_low.tar.bz2

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

| ABI | URL | China mirror |
|---|---|---|
| arm64-v8a | Download | Download |
| armeabi-v7a | Download | Download |
| x86_64 | Download | Download |
| x86 | Download | Download |

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-it_IT-riccardo-x_low with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-it_IT-riccardo-x_low/it_IT-riccardo-x_low.onnx";
  config.model.vits.tokens = "vits-piper-it_IT-riccardo-x_low/tokens.txt";
  config.model.vits.data_dir = "vits-piper-it_IT-riccardo-x_low/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc   -I /tmp/sherpa-onnx/shared/include   -o /tmp/test-piper   /tmp/test-piper.c   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-it_IT-riccardo-x_low.tar.bz2

You can use the following code to play with vits-piper-it_IT-riccardo-x_low with Python API.

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-it_IT-riccardo-x_low/it_IT-riccardo-x_low.onnx",
            data_dir="vits-piper-it_IT-riccardo-x_low/espeak-ng-data",
            tokens="vits-piper-it_IT-riccardo-x_low/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme.",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
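
To confirm that a generated file matches the sample rate listed for this model (16000 Hz, one speaker), you can inspect its header. The sketch below uses Python's standard `wave` module and assumes the output was saved as a PCM WAV file; the file name `test.wav` in the usage comment and the helper name `wav_info` are illustrative assumptions.

```python
import wave


def wav_info(path: str) -> tuple[int, int, float]:
    """Return (sample_rate, num_channels, duration_seconds) of a PCM WAV file."""
    with wave.open(path, "rb") as f:
        rate = f.getframerate()
        channels = f.getnchannels()
        duration = f.getnframes() / rate
    return rate, channels, duration


# Assumed usage, after running one of the examples that writes test.wav:
# rate, channels, duration = wav_info("test.wav")
# print(rate, channels, round(duration, 3))
```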

C++ API

You can use the following code to play with vits-piper-it_IT-riccardo-x_low with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-it_IT-riccardo-x_low/it_IT-riccardo-x_low.onnx";
  config.model.vits.tokens = "vits-piper-it_IT-riccardo-x_low/tokens.txt";
  config.model.vits.data_dir = "vits-piper-it_IT-riccardo-x_low/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
 -DSHERPA_ONNX_ENABLE_C_API=ON \
 -DCMAKE_BUILD_TYPE=Release \
 -DBUILD_SHARED_LIBS=ON \
 -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
 ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

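# List the source file before the -l flags; some linkers resolve symbols left to right
g++ \
  -std=c++17 \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-piper \
  /tmp/test-piper.cc \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-cxx-api \
  -lsherpa-onnx-c-api \
  -lonnxruntime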

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html
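
The progress callback in the example above returns 1 to continue and 0 to stop generating. As a rough illustration of that contract, here is a minimal Python sketch. It does not call sherpa-onnx: make_budget_callback and fake_generate are hypothetical names, and fake_generate only simulates an engine that delivers audio in chunks.

```python
from typing import Callable, List


def make_budget_callback(max_samples: int) -> Callable[[List[float], float], int]:
    """Return a callback that requests a stop once max_samples have been seen."""
    seen = 0

    def callback(samples: List[float], progress: float) -> int:
        nonlocal seen
        seen += len(samples)
        # 1 -> continue generating, 0 -> stop generating
        return 1 if seen < max_samples else 0

    return callback


def fake_generate(chunks: List[List[float]], callback) -> int:
    """Hypothetical stand-in for a TTS engine; returns the number of chunks delivered."""
    delivered = 0
    for i, chunk in enumerate(chunks, start=1):
        delivered += 1
        if callback(chunk, i / len(chunks)) == 0:
            break  # the callback asked us to stop
    return delivered


# Five chunks of 100 samples each, with a budget of 250 samples
chunks = [[0.0] * 100 for _ in range(5)]
delivered = fake_generate(chunks, make_budget_callback(250))
print(delivered)
```

The budget is reached while the third chunk is being delivered, so the simulated engine stops after that chunk.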

Rust API

You can use the following code to play with vits-piper-it_IT-riccardo-x_low with the Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-it_IT-riccardo-x_low/it_IT-riccardo-x_low.onnx".into()),
                tokens: Some("vits-piper-it_IT-riccardo-x_low/tokens.txt".into()),
                data_dir: Some("vits-piper-it_IT-riccardo-x_low/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-it_IT-riccardo-x_low with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-it_IT-riccardo-x_low/it_IT-riccardo-x_low.onnx',
        tokens: 'vits-piper-it_IT-riccardo-x_low/tokens.txt',
        dataDir: 'vits-piper-it_IT-riccardo-x_low/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.
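
The script above reports the real-time factor (RTF): the time spent generating divided by the duration of the generated audio, so a value below 1 means generation is faster than real time. The same arithmetic as a small Python sketch; the function name and the numbers are illustrative only.

```python
def real_time_factor(elapsed_seconds: float, num_samples: int, sample_rate: int) -> float:
    """Return elapsed time divided by the duration of the generated audio."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # audio length in seconds
    return elapsed_seconds / duration


# Made-up numbers: 48000 samples at 24000 Hz is 2 seconds of audio,
# generated in 0.5 seconds
rtf = real_time_factor(elapsed_seconds=0.5, num_samples=48000, sample_rate=24000)
print(f"RTF = {rtf:.3f}")
```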

Dart API

You can use the following code to play with vits-piper-it_IT-riccardo-x_low with the Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-it_IT-riccardo-x_low/it_IT-riccardo-x_low.onnx',
    tokens: 'vits-piper-it_IT-riccardo-x_low/tokens.txt',
    dataDir: 'vits-piper-it_IT-riccardo-x_low/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-it_IT-riccardo-x_low with the Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-it_IT-riccardo-x_low/it_IT-riccardo-x_low.onnx",
    lexicon: "",
    tokens: "vits-piper-it_IT-riccardo-x_low/tokens.txt",
    dataDir: "vits-piper-it_IT-riccardo-x_low/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-it_IT-riccardo-x_low with the C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-it_IT-riccardo-x_low/it_IT-riccardo-x_low.onnx";
config.Model.Vits.Tokens = "vits-piper-it_IT-riccardo-x_low/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-it_IT-riccardo-x_low/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-it_IT-riccardo-x_low with the Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-it_IT-riccardo-x_low/it_IT-riccardo-x_low.onnx",
        tokens = "vits-piper-it_IT-riccardo-x_low/tokens.txt",
        dataDir = "vits-piper-it_IT-riccardo-x_low/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with vits-piper-it_IT-riccardo-x_low with the Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-it_IT-riccardo-x_low/it_IT-riccardo-x_low.onnx");
    vits.setTokens("vits-piper-it_IT-riccardo-x_low/tokens.txt");
    vits.setDataDir("vits-piper-it_IT-riccardo-x_low/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-it_IT-riccardo-x_low with the Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-it_IT-riccardo-x_low/it_IT-riccardo-x_low.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-it_IT-riccardo-x_low/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-it_IT-riccardo-x_low/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-it_IT-riccardo-x_low with the Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-it_IT-riccardo-x_low/it_IT-riccardo-x_low.onnx",
				Tokens:  "vits-piper-it_IT-riccardo-x_low/tokens.txt",
				DataDir: "vits-piper-it_IT-riccardo-x_low/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme.

audio samples for each speaker are listed below:

Speaker 0

supertonic-3-it

Info about this model

This model is supertonic 3 from https://huggingface.co/Supertone/supertonic-3

It supports 31 languages: en, ko, ja, ar, bg, cs, da, de, el, es, et, fi, fr, hi, hr, hu, id, it, lt, lv, nl, pl, pt, ro, ru, sk, sl, sv, tr, uk, vi.

This page shows samples for Italian (it).
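
The examples later in this section select the language through a `lang` entry in the generation config. Below is a minimal sketch for checking a language code against the list above before using it. SUPPORTED_LANGS and check_lang are hypothetical helpers, not part of sherpa-onnx.

```python
# The 31 language codes listed above for supertonic 3
SUPPORTED_LANGS = frozenset(
    "en ko ja ar bg cs da de el es et fi fr hi hr hu id it lt lv "
    "nl pl pt ro ru sk sl sv tr uk vi".split()
)


def check_lang(lang: str) -> str:
    """Return the normalized language code, or raise if it is not in the list."""
    code = lang.strip().lower()
    if code not in SUPPORTED_LANGS:
        raise ValueError(f"Unsupported language code: {lang!r}")
    return code


lang = check_lang("it")
print(lang, len(SUPPORTED_LANGS))
```

Failing early on an unknown code keeps a typo from reaching the model as an unexpected `lang` value.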

| Number of speakers | Sample rate |
|---|---|
| 10 | 24000 |

Speaker IDs

sid: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-supertonic-3-tts-int8-2026-05-11.tar.bz2

Android APK

The following table lists the Android TTS Engine APKs with this model for sherpa-onnx v1.13.2.

| ABI | URL | China mirror |
|---|---|---|
| arm64-v8a | Download | Download |
| armeabi-v7a | Download | Download |
| x86_64 | Download | Download |
| x86 | Download | Download |

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-supertonic-3-tts-int8-2026-05-11.tar.bz2

You can use the following code to play with supertonic-3 with the Python API.

import sherpa_onnx
import soundfile as sf

tts_config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        supertonic=sherpa_onnx.OfflineTtsSupertonicModelConfig(
            duration_predictor="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
            text_encoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
            vector_estimator="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
            vocoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
            tts_json="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
            unicode_indexer="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
            voice_style="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
        ),
        debug=False,
        num_threads=2,
        provider="cpu",
    ),
)

tts = sherpa_onnx.OfflineTts(tts_config)

gen_config = sherpa_onnx.GenerationConfig()
gen_config.sid = 0
gen_config.num_steps = 8
gen_config.speed = 1.0
gen_config.extra["lang"] = "it"

audio = tts.generate("Questo è un motore di sintesi vocale che utilizza kaldi di nuova generazione", gen_config)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)

C API

You can use the following code to play with supertonic-3 with the C API.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));
  config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
  config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
  config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
  config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
  config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
  config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
  config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";

  config.model.num_threads = 2;

  // If you don't want to see debug messages, please set it to 0
  config.model.debug = 1;

  const char *text = "Questo è un motore di sintesi vocale che utilizza kaldi di nuova generazione";

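  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
  if (!tts) {
    fprintf(stderr, "Failed to create the TTS engine. Please check your config\n");
    return -1;
  }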

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.num_steps = 8;
  gen_cfg.speed = 1.0;
  gen_cfg.extra = "{\"lang\": \"it\"}";

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg,
                                             ProgressCallback, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
 -DSHERPA_ONNX_ENABLE_C_API=ON \
 -DCMAKE_BUILD_TYPE=Release \
 -DBUILD_SHARED_LIBS=ON \
 -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
 ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-supertonic.c. Then you can compile it with the following command:

# List the source file before the -l flags; some linkers resolve symbols left to right
gcc \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-supertonic \
  /tmp/test-supertonic.c \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-supertonic.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html
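
In the C example above, gen_cfg.extra is a hand-escaped JSON string. If you build that string programmatically, a JSON library avoids escaping mistakes. Here is a minimal Python sketch, assuming extra is interpreted as a JSON object as the literal suggests. make_extra is a hypothetical helper.

```python
import json


def make_extra(lang: str) -> str:
    """Serialize the extra options as a JSON object string."""
    return json.dumps({"lang": lang})


extra = make_extra("it")
print(extra)

# Round trip: parsing the string gives back the same language code
parsed = json.loads(extra)
```

The printed string has the same content as the literal in the C code once the C escapes are removed.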

C++ API

You can use the following code to play with supertonic-3 with the C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;

  config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
  config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
  config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
  config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
  config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
  config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
  config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";

  config.model.num_threads = 2;

  // If you don't want to see debug messages, please set it to 0
  config.model.debug = 1;

  std::string filename = "./test.wav";
  std::string text = "Questo è un motore di sintesi vocale che utilizza kaldi di nuova generazione";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.num_steps = 8;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed
  gen_cfg.extra = R"({"lang": "it"})";

  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
 -DSHERPA_ONNX_ENABLE_C_API=ON \
 -DCMAKE_BUILD_TYPE=Release \
 -DBUILD_SHARED_LIBS=ON \
 -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
 ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-supertonic.cc. Then you can compile it with the following command:

# List the source file before the -l flags; some linkers resolve symbols left to right
g++ \
  -std=c++17 \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-supertonic \
  /tmp/test-supertonic.cc \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-cxx-api \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-supertonic.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with supertonic-3 with the Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsSupertonicModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            supertonic: OfflineTtsSupertonicModelConfig {
                duration_predictor: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx".into()),
                text_encoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx".into()),
                vector_estimator: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx".into()),
                vocoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx".into()),
                tts_json: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json".into()),
                unicode_indexer: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin".into()),
                voice_style: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Questo è un motore di sintesi vocale che utilizza kaldi di nuova generazione";

    let gen_config = GenerationConfig {
        sid: 0,
        num_steps: 8,
        speed: 1.0,
        extra: Some(r#"{"lang": "it"}"#.into()),
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with supertonic-3 with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      supertonic: {
        durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
        textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
        vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
        vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
        ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
        unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
        voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
      },
      debug: true,
      numThreads: 2,
      provider: 'cpu',
    },
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Questo è un motore di sintesi vocale che utilizza kaldi di nuova generazione';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  numSteps: 8,
  speed: 1.0,
  extra: {lang: 'it'},
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to play with supertonic-3 with the Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final supertonic = sherpa_onnx.OfflineTtsSupertonicModelConfig(
    durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
    textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
    vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
    vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
    ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
    unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
    voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    supertonic: supertonic,
    numThreads: 2,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    numSteps: 8,
    speed: 1.0,
    extra: {'lang': 'it'},
  );
  final audio = tts.generateWithConfig(text: 'Questo è un motore di sintesi vocale che utilizza kaldi di nuova generazione', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with supertonic-3 with the Swift API.

func run() {
  let supertonic = sherpaOnnxOfflineTtsSupertonicModelConfig(
    durationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
    textEncoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
    vectorEstimator: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
    vocoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
    ttsJson: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
    unicodeIndexer: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
    voiceStyle: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(supertonic: supertonic)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Questo è un motore di sintesi vocale che utilizza kaldi di nuova generazione"
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.numSteps = 8
  genConfig.speed = 1.0
  genConfig.extra = ["lang": "it"]

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with supertonic-3 with the C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Supertonic.DurationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.Model.Supertonic.TextEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.Model.Supertonic.VectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.Model.Supertonic.Vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.Model.Supertonic.TtsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.Model.Supertonic.UnicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.Model.Supertonic.VoiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";

config.Model.NumThreads = 2;
config.Model.Debug = 1;
config.Model.Provider = "cpu";

var tts = new OfflineTts(config);
var text = "Questo è un motore di sintesi vocale che utilizza kaldi di nuova generazione";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.NumSteps = 8;
genConfig.Speed = 1.0f;
genConfig.Extra = "{\"lang\": \"it\"}";

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with supertonic-3 with the Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  val config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      supertonic = OfflineTtsSupertonicModelConfig(
        durationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
        textEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
        vectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
        vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
        ttsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
        unicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
        voiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
      ),
      numThreads = 2,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    numSteps = 8,
    speed = 1.0f,
    extra = mapOf("lang" to "it"),
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Questo è un motore di sintesi vocale che utilizza kaldi di nuova generazione",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with supertonic-3 with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var supertonic = new OfflineTtsSupertonicModelConfig();
    supertonic.setDurationPredictor("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx");
    supertonic.setTextEncoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx");
    supertonic.setVectorEstimator("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx");
    supertonic.setVocoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx");
    supertonic.setTtsJson("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json");
    supertonic.setUnicodeIndexer("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin");
    supertonic.setVoiceStyle("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setSupertonic(supertonic);
    modelConfig.setNumThreads(2);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);

    var tts = new OfflineTts(config);
    var text = "Questo è un motore di sintesi vocale che utilizza kaldi di nuova generazione";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setNumSteps(8);
    genConfig.setSpeed(1.0f);
    genConfig.setExtra("{\"lang\": \"it\"}");

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with supertonic-3 with Pascal API.

program test_supertonic;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Supertonic.DurationPredictor := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx';
  Config.Model.Supertonic.TextEncoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx';
  Config.Model.Supertonic.VectorEstimator := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx';
  Config.Model.Supertonic.Vocoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx';
  Config.Model.Supertonic.TtsJson := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json';
  Config.Model.Supertonic.UnicodeIndexer := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin';
  Config.Model.Supertonic.VoiceStyle := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin';

  Config.Model.NumThreads := 2;
  Config.Model.Debug := True;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.NumSteps := 8;
  GenConfig.Speed := 1.0;
  GenConfig.Extra := '{"lang": "it"}';

  Audio := Tts.GenerateWithConfig('Questo è un motore di sintesi vocale che utilizza kaldi di nuova generazione', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with supertonic-3 with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Supertonic: sherpa.OfflineTtsSupertonicModelConfig{
				DurationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
				TextEncoder:       "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
				VectorEstimator:   "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
				Vocoder:           "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
				TtsJson:           "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
				UnicodeIndexer:    "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
				VoiceStyle:        "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
			},
			NumThreads: 2,
			Debug:      true,
		},
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Questo è un motore di sintesi vocale che utilizza kaldi di nuova generazione"

	genConfig := sherpa.GenerationConfig{
		Sid:      0,
		NumSteps: 8,
		Speed:    1.0,
		Extra:    `{"lang": "it"}`,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

Sample audio for each speaker is listed below:

Speaker 0

0

Ciao mondo.

1

Come stai oggi?

2

Il cielo è blu e il vento è leggero.

3

L’apprendimento automatico aiuta i computer a imparare dai dati.

4

La sintesi vocale trasforma il testo in audio chiaro.

5

Gli studenti hanno letto una breve storia in biblioteca.

6

Il treno ha subito un ritardo per lavori sui binari.

7

I modelli piccoli funzionano rapidamente sui dispositivi locali.

8

Un assistente vocale aiuta nelle attività quotidiane.

9

Una lettura stabile è importante per frasi brevi e lunghe.

Speaker 1

0

Ciao mondo.

1

Come stai oggi?

2

Il cielo è blu e il vento è leggero.

3

L’apprendimento automatico aiuta i computer a imparare dai dati.

4

La sintesi vocale trasforma il testo in audio chiaro.

5

Gli studenti hanno letto una breve storia in biblioteca.

6

Il treno ha subito un ritardo per lavori sui binari.

7

I modelli piccoli funzionano rapidamente sui dispositivi locali.

8

Un assistente vocale aiuta nelle attività quotidiane.

9

Una lettura stabile è importante per frasi brevi e lunghe.

Speaker 2

0

Ciao mondo.

1

Come stai oggi?

2

Il cielo è blu e il vento è leggero.

3

L’apprendimento automatico aiuta i computer a imparare dai dati.

4

La sintesi vocale trasforma il testo in audio chiaro.

5

Gli studenti hanno letto una breve storia in biblioteca.

6

Il treno ha subito un ritardo per lavori sui binari.

7

I modelli piccoli funzionano rapidamente sui dispositivi locali.

8

Un assistente vocale aiuta nelle attività quotidiane.

9

Una lettura stabile è importante per frasi brevi e lunghe.

Speaker 3

0

Ciao mondo.

1

Come stai oggi?

2

Il cielo è blu e il vento è leggero.

3

L’apprendimento automatico aiuta i computer a imparare dai dati.

4

La sintesi vocale trasforma il testo in audio chiaro.

5

Gli studenti hanno letto una breve storia in biblioteca.

6

Il treno ha subito un ritardo per lavori sui binari.

7

I modelli piccoli funzionano rapidamente sui dispositivi locali.

8

Un assistente vocale aiuta nelle attività quotidiane.

9

Una lettura stabile è importante per frasi brevi e lunghe.

Speaker 4

0

Ciao mondo.

1

Come stai oggi?

2

Il cielo è blu e il vento è leggero.

3

L’apprendimento automatico aiuta i computer a imparare dai dati.

4

La sintesi vocale trasforma il testo in audio chiaro.

5

Gli studenti hanno letto una breve storia in biblioteca.

6

Il treno ha subito un ritardo per lavori sui binari.

7

I modelli piccoli funzionano rapidamente sui dispositivi locali.

8

Un assistente vocale aiuta nelle attività quotidiane.

9

Una lettura stabile è importante per frasi brevi e lunghe.

Speaker 5

0

Ciao mondo.

1

Come stai oggi?

2

Il cielo è blu e il vento è leggero.

3

L’apprendimento automatico aiuta i computer a imparare dai dati.

4

La sintesi vocale trasforma il testo in audio chiaro.

5

Gli studenti hanno letto una breve storia in biblioteca.

6

Il treno ha subito un ritardo per lavori sui binari.

7

I modelli piccoli funzionano rapidamente sui dispositivi locali.

8

Un assistente vocale aiuta nelle attività quotidiane.

9

Una lettura stabile è importante per frasi brevi e lunghe.

Speaker 6

0

Ciao mondo.

1

Come stai oggi?

2

Il cielo è blu e il vento è leggero.

3

L’apprendimento automatico aiuta i computer a imparare dai dati.

4

La sintesi vocale trasforma il testo in audio chiaro.

5

Gli studenti hanno letto una breve storia in biblioteca.

6

Il treno ha subito un ritardo per lavori sui binari.

7

I modelli piccoli funzionano rapidamente sui dispositivi locali.

8

Un assistente vocale aiuta nelle attività quotidiane.

9

Una lettura stabile è importante per frasi brevi e lunghe.

Speaker 7

0

Ciao mondo.

1

Come stai oggi?

2

Il cielo è blu e il vento è leggero.

3

L’apprendimento automatico aiuta i computer a imparare dai dati.

4

La sintesi vocale trasforma il testo in audio chiaro.

5

Gli studenti hanno letto una breve storia in biblioteca.

6

Il treno ha subito un ritardo per lavori sui binari.

7

I modelli piccoli funzionano rapidamente sui dispositivi locali.

8

Un assistente vocale aiuta nelle attività quotidiane.

9

Una lettura stabile è importante per frasi brevi e lunghe.

Speaker 8

0

Ciao mondo.

1

Come stai oggi?

2

Il cielo è blu e il vento è leggero.

3

L’apprendimento automatico aiuta i computer a imparare dai dati.

4

La sintesi vocale trasforma il testo in audio chiaro.

5

Gli studenti hanno letto una breve storia in biblioteca.

6

Il treno ha subito un ritardo per lavori sui binari.

7

I modelli piccoli funzionano rapidamente sui dispositivi locali.

8

Un assistente vocale aiuta nelle attività quotidiane.

9

Una lettura stabile è importante per frasi brevi e lunghe.

Speaker 9

0

Ciao mondo.

1

Come stai oggi?

2

Il cielo è blu e il vento è leggero.

3

L’apprendimento automatico aiuta i computer a imparare dai dati.

4

La sintesi vocale trasforma il testo in audio chiaro.

5

Gli studenti hanno letto una breve storia in biblioteca.

6

Il treno ha subito un ritardo per lavori sui binari.

7

I modelli piccoli funzionano rapidamente sui dispositivi locali.

8

Un assistente vocale aiuta nelle attività quotidiane.

9

Una lettura stabile è importante per frasi brevi e lunghe.

Japanese

This section lists text-to-speech models for Japanese.

supertonic-3-ja

Info about this model

This model is supertonic 3 from https://huggingface.co/Supertone/supertonic-3

It supports 31 languages: en, ko, ja, ar, bg, cs, da, de, el, es, et, fi, fr, hi, hr, hu, id, it, lt, lv, nl, pl, pt, ro, ru, sk, sl, sv, tr, uk, vi.

This page shows samples for Japanese (ja).

Number of speakers | Sample rate
10 | 24000

Speaker IDs

sid: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9
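The language list and speaker IDs above can be checked before a request is sent to the model. The following is a minimal sketch in plain Python, not part of sherpa-onnx: the helper name `make_generation_params` and the constants are hypothetical, and the JSON shape `{"lang": "..."}` is taken from the `extra` field used in the examples on this page.

```python
import json

# Language codes listed for supertonic 3 on this page.
SUPPORTED_LANGS = (
    "en", "ko", "ja", "ar", "bg", "cs", "da", "de", "el", "es", "et",
    "fi", "fr", "hi", "hr", "hu", "id", "it", "lt", "lv", "nl", "pl",
    "pt", "ro", "ru", "sk", "sl", "sv", "tr", "uk", "vi",
)

# Speaker IDs 0..9 are listed on this page.
NUM_SPEAKERS = 10


def make_generation_params(lang, sid=0):
    """Hypothetical helper: validate lang and sid, return a plain dict.

    The "extra" value is the JSON string form used by the C, C# and Java
    examples on this page.
    """
    if lang not in SUPPORTED_LANGS:
        raise ValueError(f"unsupported language code: {lang!r}")
    if not 0 <= sid < NUM_SPEAKERS:
        raise ValueError(f"sid must be in [0, {NUM_SPEAKERS - 1}], got {sid}")
    return {"sid": sid, "extra": json.dumps({"lang": lang})}
```

The returned dictionary is only an intermediate value; copy `sid` and `extra` into whatever generation config type your language binding provides.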

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-supertonic-3-tts-int8-2026-05-11.tar.bz2

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI | URL | China mirror
arm64-v8a | Download | Download
armeabi-v7a | Download | Download
x86_64 | Download | Download
x86 | Download | Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-supertonic-3-tts-int8-2026-05-11.tar.bz2

You can use the following code to play with supertonic-3:

import sherpa_onnx
import soundfile as sf

tts_config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        supertonic=sherpa_onnx.OfflineTtsSupertonicModelConfig(
            duration_predictor="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
            text_encoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
            vector_estimator="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
            vocoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
            tts_json="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
            unicode_indexer="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
            voice_style="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
        ),
        debug=False,
        num_threads=2,
        provider="cpu",
    ),
)

tts = sherpa_onnx.OfflineTts(tts_config)

gen_config = sherpa_onnx.GenerationConfig()
gen_config.sid = 0
gen_config.num_steps = 8
gen_config.speed = 1.0
gen_config.extra["lang"] = "ja"

audio = tts.generate("これは次世代のkaldiを使用したテキスト読み上げエンジンです", gen_config)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
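After the script above has written test.wav, it can be useful to confirm the sample rate and duration of the result. The sketch below uses only the standard-library `wave` module and assumes the file is plain PCM WAV; the function name `describe_wav` is hypothetical.

```python
import wave


def describe_wav(path):
    """Return (sample_rate, num_channels, duration_seconds) of a PCM WAV file."""
    with wave.open(path, "rb") as f:
        sample_rate = f.getframerate()
        num_channels = f.getnchannels()
        # One frame holds one sample per channel.
        duration = f.getnframes() / sample_rate
    return sample_rate, num_channels, duration
```

For example, `describe_wav("test.wav")` should report the sample rate listed for this model if generation succeeded.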

C API

You can use the following code to play with supertonic-3 with C API.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));
  config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
  config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
  config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
  config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
  config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
  config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
  config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";

  config.model.num_threads = 2;

  // If you don't want to see debug messages, please set it to 0
  config.model.debug = 1;

  const char *text = "これは次世代のkaldiを使用したテキスト読み上げエンジンです";

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.num_steps = 8;
  gen_cfg.speed = 1.0;
  gen_cfg.extra = "{\"lang\": \"ja\"}";

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg,
                                             ProgressCallback, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

To build the example, first build sherpa-onnx with shared libraries and the C API enabled:

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
 -DSHERPA_ONNX_ENABLE_C_API=ON \
 -DCMAKE_BUILD_TYPE=Release \
 -DBUILD_SHARED_LIBS=ON \
 -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
 ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-supertonic.c. Then you can compile it with the following command:

gcc \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-supertonic \
  /tmp/test-supertonic.c \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-supertonic.
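If you launch the compiled binary from a script rather than a shell, the same library path has to be passed in the child process environment. The following is a minimal Python sketch of that step; the helper name `env_with_lib_dir` is hypothetical, and the paths are the ones used in the build instructions above.

```python
import os
import sys


def env_with_lib_dir(lib_dir, base_env=None, platform=None):
    """Return a copy of the environment with lib_dir prepended to the
    dynamic loader search path (DYLD_LIBRARY_PATH on macOS,
    LD_LIBRARY_PATH otherwise)."""
    env = dict(os.environ if base_env is None else base_env)
    platform = sys.platform if platform is None else platform
    var = "DYLD_LIBRARY_PATH" if platform == "darwin" else "LD_LIBRARY_PATH"
    current = env.get(var, "")
    # Keep any existing entries after the new directory.
    env[var] = lib_dir if not current else lib_dir + ":" + current
    return env
```

The result can be passed as the `env` argument of `subprocess.run(["/tmp/test-supertonic"], cwd="/tmp", env=...)`.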

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html
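The ProgressCallback in the C example above returns 1 to continue generating and 0 to stop. The sketch below illustrates only that contract in plain Python. It is not sherpa-onnx code: `run_with_callback` is a hypothetical stand-in for a generation loop, and the chunking is invented for illustration.

```python
def run_with_callback(chunks, callback):
    """Hypothetical driver: hand each chunk to callback with a progress
    value in (0, 1]; stop early as soon as the callback returns 0."""
    delivered = []
    total = len(chunks)
    for i, chunk in enumerate(chunks, start=1):
        delivered.extend(chunk)
        progress = i / total
        if callback(chunk, progress) == 0:
            break  # the caller asked to stop
    return delivered


def stop_at_half(samples, progress):
    """Example callback: continue (1) until half of the chunks are done."""
    return 0 if progress >= 0.5 else 1
```

With four chunks, `stop_at_half` ends the loop after the second one, so only the samples from the first two chunks are collected.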

C++ API

You can use the following code to play with supertonic-3 with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;

  config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
  config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
  config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
  config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
  config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
  config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
  config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";

  config.model.num_threads = 2;

  // If you don't want to see debug messages, please set it to 0
  config.model.debug = 1;

  std::string filename = "./test.wav";
  std::string text = "これは次世代のkaldiを使用したテキスト読み上げエンジンです";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.num_steps = 8;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed
  gen_cfg.extra = R"({"lang": "ja"})";

  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

To build the example, first build sherpa-onnx with shared libraries and the C API enabled:

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
 -DSHERPA_ONNX_ENABLE_C_API=ON \
 -DCMAKE_BUILD_TYPE=Release \
 -DBUILD_SHARED_LIBS=ON \
 -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
 ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-supertonic.cc. Then you can compile it with the following command:

g++ \
  -std=c++17 \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-supertonic \
  /tmp/test-supertonic.cc \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-cxx-api \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-supertonic.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with supertonic-3 with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsSupertonicModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            supertonic: OfflineTtsSupertonicModelConfig {
                duration_predictor: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx".into()),
                text_encoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx".into()),
                vector_estimator: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx".into()),
                vocoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx".into()),
                tts_json: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json".into()),
                unicode_indexer: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin".into()),
                voice_style: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "これは次世代のkaldiを使用したテキスト読み上げエンジンです";

    let gen_config = GenerationConfig {
        sid: 0,
        num_steps: 8,
        speed: 1.0,
        extra: Some(r#"{"lang": "ja"}"#.into()),
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with supertonic-3 with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      supertonic: {
        durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
        textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
        vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
        vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
        ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
        unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
        voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
      },
      debug: true,
      numThreads: 2,
      provider: 'cpu',
    },
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'これは次世代のkaldiを使用したテキスト読み上げエンジンです';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  numSteps: 8,
  speed: 1.0,
  extra: {lang: 'ja'},
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.
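The Node.js example above reports a real-time factor (RTF), computed as elapsed wall-clock time divided by the duration of the generated audio; values below 1 mean generation ran faster than playback. The same arithmetic as a small Python sketch, with a hypothetical function name:

```python
def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """RTF = elapsed time / audio duration, where duration is the number
    of samples divided by the sample rate."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate
    return elapsed_seconds / duration
```

For instance, 48000 samples at 24000 Hz is 2 seconds of audio, so generating it in 0.5 seconds gives an RTF of 0.25.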

Dart API

You can use the following code to play with supertonic-3 with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final supertonic = sherpa_onnx.OfflineTtsSupertonicModelConfig(
    durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
    textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
    vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
    vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
    ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
    unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
    voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    supertonic: supertonic,
    numThreads: 2,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    numSteps: 8,
    speed: 1.0,
    extra: {'lang': 'ja'},
  );
  final audio = tts.generateWithConfig(text: 'これは次世代のkaldiを使用したテキスト読み上げエンジンです', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with supertonic-3 with Swift API.

func run() {
  let supertonic = sherpaOnnxOfflineTtsSupertonicModelConfig(
    durationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
    textEncoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
    vectorEstimator: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
    vocoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
    ttsJson: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
    unicodeIndexer: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
    voiceStyle: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(supertonic: supertonic)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "これは次世代のkaldiを使用したテキスト読み上げエンジンです"
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.numSteps = 8
  genConfig.speed = 1.0
  genConfig.extra = ["lang": "ja"]

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with supertonic-3 with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Supertonic.DurationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.Model.Supertonic.TextEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.Model.Supertonic.VectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.Model.Supertonic.Vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.Model.Supertonic.TtsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.Model.Supertonic.UnicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.Model.Supertonic.VoiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";

config.Model.NumThreads = 2;
config.Model.Debug = 1;
config.Model.Provider = "cpu";

var tts = new OfflineTts(config);
var text = "これは次世代のkaldiを使用したテキスト読み上げエンジンです";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.NumSteps = 8;
genConfig.Speed = 1.0f;
genConfig.Extra = "{\"lang\": \"ja\"}";

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with supertonic-3 with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  val config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      supertonic = OfflineTtsSupertonicModelConfig(
        durationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
        textEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
        vectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
        vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
        ttsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
        unicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
        voiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
      ),
      numThreads = 2,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    numSteps = 8,
    speed = 1.0f,
    extra = mapOf("lang" to "ja"),
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "これは次世代のkaldiを使用したテキスト読み上げエンジンです",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with supertonic-3 with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var supertonic = new OfflineTtsSupertonicModelConfig();
    supertonic.setDurationPredictor("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx");
    supertonic.setTextEncoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx");
    supertonic.setVectorEstimator("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx");
    supertonic.setVocoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx");
    supertonic.setTtsJson("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json");
    supertonic.setUnicodeIndexer("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin");
    supertonic.setVoiceStyle("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setSupertonic(supertonic);
    modelConfig.setNumThreads(2);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);

    var tts = new OfflineTts(config);
    var text = "これは次世代のkaldiを使用したテキスト読み上げエンジンです";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setNumSteps(8);
    genConfig.setSpeed(1.0f);
    genConfig.setExtra("{\"lang\": \"ja\"}");

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.
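
The Java example above passes `extra` as a hand-written JSON string. When the language code comes from user input or configuration, building that string with a JSON serializer avoids quoting mistakes. A minimal Python sketch, assuming only the `lang` key shown in this section; `build_extra` is a hypothetical helper, not part of sherpa-onnx.

```python
import json


def build_extra(lang: str) -> str:
    """Serialize the language option as a JSON object string."""
    return json.dumps({"lang": lang}, ensure_ascii=False)


extra = build_extra("ja")
print(extra)  # prints {"lang": "ja"}
```

Whether other keys are accepted in `extra` is not stated here, so the sketch deliberately limits itself to `lang`.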

Pascal API

You can use the following code to play with supertonic-3 with Pascal API.

program test_supertonic;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Supertonic.DurationPredictor := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx';
  Config.Model.Supertonic.TextEncoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx';
  Config.Model.Supertonic.VectorEstimator := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx';
  Config.Model.Supertonic.Vocoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx';
  Config.Model.Supertonic.TtsJson := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json';
  Config.Model.Supertonic.UnicodeIndexer := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin';
  Config.Model.Supertonic.VoiceStyle := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin';

  Config.Model.NumThreads := 2;
  Config.Model.Debug := True;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.NumSteps := 8;
  GenConfig.Speed := 1.0;
  GenConfig.Extra := '{"lang": "ja"}';

  Audio := Tts.GenerateWithConfig('これは次世代のkaldiを使用したテキスト読み上げエンジンです', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with supertonic-3 with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Supertonic: sherpa.OfflineTtsSupertonicModelConfig{
				DurationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
				TextEncoder:       "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
				VectorEstimator:   "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
				Vocoder:           "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
				TtsJson:           "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
				UnicodeIndexer:    "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
				VoiceStyle:        "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
			},
			NumThreads: 2,
			Debug:      true,
		},
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "これは次世代のkaldiを使用したテキスト読み上げエンジンです"

	genConfig := sherpa.GenerationConfig{
		Sid:      0,
		NumSteps: 8,
		Speed:    1.0,
		Extra:    `{"lang": "ja"}`,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

Sample audio for different speakers is listed below:

Speaker 0

0

こんにちは世界。

1

今日はどのように過ごしていますか。

2

空は青く、風は穏やかです。

3

機械学習はデータから学ぶ技術です。

4

音声合成は文章を自然な声に変換します。

5

図書館では多くの人が静かに本を読んでいます。

6

新しい列車の時刻表は来週から使われます。

7

研究者たちは小さな端末で動くモデルを評価しました。

8

音声アシスタントは毎日の作業を手伝います。

9

天気予報によると午後から雨が降るそうです。

Speaker 1

0

こんにちは世界。

1

今日はどのように過ごしていますか。

2

空は青く、風は穏やかです。

3

機械学習はデータから学ぶ技術です。

4

音声合成は文章を自然な声に変換します。

5

図書館では多くの人が静かに本を読んでいます。

6

新しい列車の時刻表は来週から使われます。

7

研究者たちは小さな端末で動くモデルを評価しました。

8

音声アシスタントは毎日の作業を手伝います。

9

天気予報によると午後から雨が降るそうです。

Speaker 2

0

こんにちは世界。

1

今日はどのように過ごしていますか。

2

空は青く、風は穏やかです。

3

機械学習はデータから学ぶ技術です。

4

音声合成は文章を自然な声に変換します。

5

図書館では多くの人が静かに本を読んでいます。

6

新しい列車の時刻表は来週から使われます。

7

研究者たちは小さな端末で動くモデルを評価しました。

8

音声アシスタントは毎日の作業を手伝います。

9

天気予報によると午後から雨が降るそうです。

Speaker 3

0

こんにちは世界。

1

今日はどのように過ごしていますか。

2

空は青く、風は穏やかです。

3

機械学習はデータから学ぶ技術です。

4

音声合成は文章を自然な声に変換します。

5

図書館では多くの人が静かに本を読んでいます。

6

新しい列車の時刻表は来週から使われます。

7

研究者たちは小さな端末で動くモデルを評価しました。

8

音声アシスタントは毎日の作業を手伝います。

9

天気予報によると午後から雨が降るそうです。

Speaker 4

0

こんにちは世界。

1

今日はどのように過ごしていますか。

2

空は青く、風は穏やかです。

3

機械学習はデータから学ぶ技術です。

4

音声合成は文章を自然な声に変換します。

5

図書館では多くの人が静かに本を読んでいます。

6

新しい列車の時刻表は来週から使われます。

7

研究者たちは小さな端末で動くモデルを評価しました。

8

音声アシスタントは毎日の作業を手伝います。

9

天気予報によると午後から雨が降るそうです。

Speaker 5

0

こんにちは世界。

1

今日はどのように過ごしていますか。

2

空は青く、風は穏やかです。

3

機械学習はデータから学ぶ技術です。

4

音声合成は文章を自然な声に変換します。

5

図書館では多くの人が静かに本を読んでいます。

6

新しい列車の時刻表は来週から使われます。

7

研究者たちは小さな端末で動くモデルを評価しました。

8

音声アシスタントは毎日の作業を手伝います。

9

天気予報によると午後から雨が降るそうです。

Speaker 6

0

こんにちは世界。

1

今日はどのように過ごしていますか。

2

空は青く、風は穏やかです。

3

機械学習はデータから学ぶ技術です。

4

音声合成は文章を自然な声に変換します。

5

図書館では多くの人が静かに本を読んでいます。

6

新しい列車の時刻表は来週から使われます。

7

研究者たちは小さな端末で動くモデルを評価しました。

8

音声アシスタントは毎日の作業を手伝います。

9

天気予報によると午後から雨が降るそうです。

Speaker 7

0

こんにちは世界。

1

今日はどのように過ごしていますか。

2

空は青く、風は穏やかです。

3

機械学習はデータから学ぶ技術です。

4

音声合成は文章を自然な声に変換します。

5

図書館では多くの人が静かに本を読んでいます。

6

新しい列車の時刻表は来週から使われます。

7

研究者たちは小さな端末で動くモデルを評価しました。

8

音声アシスタントは毎日の作業を手伝います。

9

天気予報によると午後から雨が降るそうです。

Speaker 8

0

こんにちは世界。

1

今日はどのように過ごしていますか。

2

空は青く、風は穏やかです。

3

機械学習はデータから学ぶ技術です。

4

音声合成は文章を自然な声に変換します。

5

図書館では多くの人が静かに本を読んでいます。

6

新しい列車の時刻表は来週から使われます。

7

研究者たちは小さな端末で動くモデルを評価しました。

8

音声アシスタントは毎日の作業を手伝います。

9

天気予報によると午後から雨が降るそうです。

Speaker 9

0

こんにちは世界。

1

今日はどのように過ごしていますか。

2

空は青く、風は穏やかです。

3

機械学習はデータから学ぶ技術です。

4

音声合成は文章を自然な声に変換します。

5

図書館では多くの人が静かに本を読んでいます。

6

新しい列車の時刻表は来週から使われます。

7

研究者たちは小さな端末で動くモデルを評価しました。

8

音声アシスタントは毎日の作業を手伝います。

9

天気予報によると午後から雨が降るそうです。

Kazakh

This section lists text to speech models for Kazakh.

vits-piper-kk_KZ-iseke-x_low

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/kk/kk_KZ/iseke/x_low

Number of speakers | Sample rate
1 | 16000

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-kk_KZ-iseke-x_low.tar.bz2
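
For scripted setups, the download and extraction can be done from Python with the standard library only. This is a sketch under stated assumptions: the helper names are hypothetical, and it assumes the archive unpacks into a directory named after the archive, which matches the paths used in the code examples in this section.

```python
import tarfile
import urllib.request
from pathlib import Path

MODEL_URL = (
    "https://github.com/k2-fsa/sherpa-onnx/releases/download/"
    "tts-models/vits-piper-kk_KZ-iseke-x_low.tar.bz2"
)


def archive_name(url: str) -> str:
    """Return the file name at the end of a download URL."""
    return url.rsplit("/", 1)[-1]


def model_dir_name(url: str) -> str:
    """Assumed directory name: the archive name without '.tar.bz2'."""
    name = archive_name(url)
    suffix = ".tar.bz2"
    return name[: -len(suffix)] if name.endswith(suffix) else name


def download_and_extract(url: str, dest: Path = Path(".")) -> Path:
    """Download the archive into dest, unpack it, and delete the archive."""
    dest.mkdir(parents=True, exist_ok=True)
    archive = dest / archive_name(url)
    urllib.request.urlretrieve(url, str(archive))
    with tarfile.open(archive, "r:bz2") as tar:
        tar.extractall(dest)
    archive.unlink()
    return dest / model_dir_name(url)


# Only the name helpers run here; the download itself needs network access.
print(model_dir_name(MODEL_URL))
```

Calling `download_and_extract(MODEL_URL)` performs the actual download and returns the path of the extracted model directory.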

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI | URL | China mirror
arm64-v8a | Download | Download
armeabi-v7a | Download | Download
x86_64 | Download | Download
x86 | Download | Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-kk_KZ-iseke-x_low with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-kk_KZ-iseke-x_low/kk_KZ-iseke-x_low.onnx";
  config.model.vits.tokens = "vits-piper-kk_KZ-iseke-x_low/tokens.txt";
  config.model.vits.data_dir = "vits-piper-kk_KZ-iseke-x_low/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

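  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
  if (!tts) {
    // Creation fails, for example, when a model file path is wrong
    fprintf(stderr, "Failed to create the TTS engine. Please check your config\n");
    return 1;
  }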

  const char *text = "Әлемнің жұлдыздары сенің көзің, жаным.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-kk_KZ-iseke-x_low.tar.bz2

You can use the following code to play with vits-piper-kk_KZ-iseke-x_low

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-kk_KZ-iseke-x_low/kk_KZ-iseke-x_low.onnx",
            data_dir="vits-piper-kk_KZ-iseke-x_low/espeak-ng-data",
            tokens="vits-piper-kk_KZ-iseke-x_low/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Әлемнің жұлдыздары сенің көзің, жаным.",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
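
The example above saves the audio with the third-party soundfile package. If only the standard library is available, the samples can be written as 16-bit PCM with the `wave` module. This is a sketch, assuming the samples are mono floats in the range [-1, 1]; `write_wave_int16` is a hypothetical helper, not part of sherpa-onnx.

```python
import os
import struct
import tempfile
import wave
from typing import Sequence


def write_wave_int16(filename: str, samples: Sequence[float], sample_rate: int) -> None:
    """Write mono float samples as a 16-bit PCM WAV file."""
    pcm = bytearray()
    for s in samples:
        # Clip to [-1, 1] (assumed input range) before scaling to int16.
        clipped = max(-1.0, min(1.0, float(s)))
        pcm += struct.pack("<h", int(round(clipped * 32767)))
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)  # 2 bytes per sample = 16 bits
        f.setframerate(sample_rate)
        f.writeframes(bytes(pcm))


# Round-trip check with a few made-up samples.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.wav")
write_wave_int16(demo_path, [0.0, 0.5, -1.0, 2.0], 16000)
with wave.open(demo_path, "rb") as f:
    demo_info = (f.getnchannels(), f.getsampwidth(), f.getframerate(), f.getnframes())
print(demo_info)  # prints (1, 2, 16000, 4)
```

With the `audio` object from the example above, the call would be `write_wave_int16("test.wav", audio.samples, audio.sample_rate)`.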

C++ API

You can use the following code to play with vits-piper-kk_KZ-iseke-x_low with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-kk_KZ-iseke-x_low/kk_KZ-iseke-x_low.onnx";
  config.model.vits.tokens = "vits-piper-kk_KZ-iseke-x_low/tokens.txt";
  config.model.vits.data_dir = "vits-piper-kk_KZ-iseke-x_low/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Әлемнің жұлдыздары сенің көзің, жаным.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with vits-piper-kk_KZ-iseke-x_low with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-kk_KZ-iseke-x_low/kk_KZ-iseke-x_low.onnx".into()),
                tokens: Some("vits-piper-kk_KZ-iseke-x_low/tokens.txt".into()),
                data_dir: Some("vits-piper-kk_KZ-iseke-x_low/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Әлемнің жұлдыздары сенің көзің, жаным.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-kk_KZ-iseke-x_low with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-kk_KZ-iseke-x_low/kk_KZ-iseke-x_low.onnx',
        tokens: 'vits-piper-kk_KZ-iseke-x_low/tokens.txt',
        dataDir: 'vits-piper-kk_KZ-iseke-x_low/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Әлемнің жұлдыздары сенің көзің, жаным.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.
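
The Node.js example above reports a real-time factor (RTF): the elapsed synthesis time divided by the duration of the generated audio, so a value below 1 means synthesis runs faster than real time. The same arithmetic as a small Python sketch; `real_time_factor` is a hypothetical helper, and the numbers are made up for illustration, not measurements.

```python
def real_time_factor(elapsed_seconds: float, num_samples: int, sample_rate: int) -> float:
    """Return elapsed time divided by the duration of the generated audio."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # audio length in seconds
    return elapsed_seconds / duration


# 32000 samples at 16000 Hz is 2 seconds of audio; 0.5 s of compute gives 0.25.
rtf = real_time_factor(elapsed_seconds=0.5, num_samples=32000, sample_rate=16000)
print(f"RTF = {rtf:.3f}")  # prints RTF = 0.250
```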

Dart API

You can use the following code to play with vits-piper-kk_KZ-iseke-x_low with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-kk_KZ-iseke-x_low/kk_KZ-iseke-x_low.onnx',
    tokens: 'vits-piper-kk_KZ-iseke-x_low/tokens.txt',
    dataDir: 'vits-piper-kk_KZ-iseke-x_low/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Әлемнің жұлдыздары сенің көзің, жаным.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-kk_KZ-iseke-x_low with Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-kk_KZ-iseke-x_low/kk_KZ-iseke-x_low.onnx",
    lexicon: "",
    tokens: "vits-piper-kk_KZ-iseke-x_low/tokens.txt",
    dataDir: "vits-piper-kk_KZ-iseke-x_low/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Әлемнің жұлдыздары сенің көзің, жаным."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-kk_KZ-iseke-x_low with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-kk_KZ-iseke-x_low/kk_KZ-iseke-x_low.onnx";
config.Model.Vits.Tokens = "vits-piper-kk_KZ-iseke-x_low/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-kk_KZ-iseke-x_low/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Әлемнің жұлдыздары сенің көзің, жаным.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-kk_KZ-iseke-x_low with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  val config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-kk_KZ-iseke-x_low/kk_KZ-iseke-x_low.onnx",
        tokens = "vits-piper-kk_KZ-iseke-x_low/tokens.txt",
        dataDir = "vits-piper-kk_KZ-iseke-x_low/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Әлемнің жұлдыздары сенің көзің, жаным.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with vits-piper-kk_KZ-iseke-x_low with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-kk_KZ-iseke-x_low/kk_KZ-iseke-x_low.onnx");
    vits.setTokens("vits-piper-kk_KZ-iseke-x_low/tokens.txt");
    vits.setDataDir("vits-piper-kk_KZ-iseke-x_low/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Әлемнің жұлдыздары сенің көзің, жаным.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-kk_KZ-iseke-x_low with Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-kk_KZ-iseke-x_low/kk_KZ-iseke-x_low.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-kk_KZ-iseke-x_low/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-kk_KZ-iseke-x_low/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Әлемнің жұлдыздары сенің көзің, жаным.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-kk_KZ-iseke-x_low with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-kk_KZ-iseke-x_low/kk_KZ-iseke-x_low.onnx",
				Tokens:  "vits-piper-kk_KZ-iseke-x_low/tokens.txt",
				DataDir: "vits-piper-kk_KZ-iseke-x_low/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Әлемнің жұлдыздары сенің көзің, жаным."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Әлемнің жұлдыздары сенің көзің, жаным.

Sample audio for different speakers is listed below:

Speaker 0

vits-piper-kk_KZ-issai-high

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/kk/kk_KZ/issai/high

Number of speakers | Sample rate
6 | 22050

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-kk_KZ-issai-high.tar.bz2

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI | URL | China mirror
arm64-v8a | Download | Download
armeabi-v7a | Download | Download
x86_64 | Download | Download
x86 | Download | Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-kk_KZ-issai-high with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-kk_KZ-issai-high/kk_KZ-issai-high.onnx";
  config.model.vits.tokens = "vits-piper-kk_KZ-issai-high/tokens.txt";
  config.model.vits.data_dir = "vits-piper-kk_KZ-issai-high/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

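  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
  if (!tts) {
    // Creation fails, for example, when a model file path is wrong
    fprintf(stderr, "Failed to create the TTS engine. Please check your config\n");
    return 1;
  }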

  const char *text = "Әлемнің жұлдыздары сенің көзің, жаным.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-kk_KZ-issai-high.tar.bz2

You can use the following code to play with vits-piper-kk_KZ-issai-high

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-kk_KZ-issai-high/kk_KZ-issai-high.onnx",
            data_dir="vits-piper-kk_KZ-issai-high/espeak-ng-data",
            tokens="vits-piper-kk_KZ-issai-high/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Әлемнің жұлдыздары сенің көзің, жаным.",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
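
The example above uses speaker 0 only, while this model is listed with several speakers. One way to compare the voices is to call `generate` once per speaker ID. The sketch below assumes an object with a `generate(text=..., sid=..., speed=...)` method as used above; `generate_for_all_speakers` is a hypothetical helper, and `FakeTts` exists only so the sketch runs without a model.

```python
from typing import Any, Dict


def generate_for_all_speakers(
    tts: Any, text: str, num_speakers: int, speed: float = 1.0
) -> Dict[int, Any]:
    """Call tts.generate once per speaker ID and collect the results."""
    return {
        sid: tts.generate(text=text, sid=sid, speed=speed)
        for sid in range(num_speakers)
    }


class FakeTts:
    """Stand-in used only to exercise the helper without a model."""

    def generate(self, text: str, sid: int, speed: float) -> str:
        return f"audio(sid={sid}, speed={speed})"


results = generate_for_all_speakers(
    FakeTts(), "Әлемнің жұлдыздары сенің көзің, жаным.", num_speakers=6
)
print(sorted(results))  # prints [0, 1, 2, 3, 4, 5]
```

With the real `tts` object from the example above, each returned audio can be saved separately, for instance with `sf.write(f"speaker-{sid}.wav", audio.samples, samplerate=audio.sample_rate)`.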

C++ API

You can use the following code to play with vits-piper-kk_KZ-issai-high with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-kk_KZ-issai-high/kk_KZ-issai-high.onnx";
  config.model.vits.tokens = "vits-piper-kk_KZ-issai-high/tokens.txt";
  config.model.vits.data_dir = "vits-piper-kk_KZ-issai-high/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Әлемнің жұлдыздары сенің көзің, жаным.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++   -std=c++17   -I /tmp/sherpa-onnx/shared/include   -L /tmp/sherpa-onnx/shared/lib   -o /tmp/test-piper   /tmp/test-piper.cc   -lsherpa-onnx-cxx-api   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to try vits-piper-kk_KZ-issai-high with the Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-kk_KZ-issai-high/kk_KZ-issai-high.onnx".into()),
                tokens: Some("vits-piper-kk_KZ-issai-high/tokens.txt".into()),
                data_dir: Some("vits-piper-kk_KZ-issai-high/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Әлемнің жұлдыздары сенің көзің, жаным.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-kk_KZ-issai-high with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-kk_KZ-issai-high/kk_KZ-issai-high.onnx',
        tokens: 'vits-piper-kk_KZ-issai-high/tokens.txt',
        dataDir: 'vits-piper-kk_KZ-issai-high/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Әлемнің жұлдыздары сенің көзің, жаным.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.
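
The Node.js example above reports a real-time factor (RTF): the time spent generating divided by the duration of the generated audio. The following Python sketch shows the same arithmetic in isolation. The function name and the numbers are illustrative assumptions, not part of sherpa-onnx.

```python
def real_time_factor(elapsed_seconds: float, num_samples: int, sample_rate: int) -> float:
    """Return processing time divided by the duration of the generated audio."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # audio length in seconds
    return elapsed_seconds / duration


# Hypothetical numbers: 0.5 s of compute for 2 s of 16 kHz audio.
rtf = real_time_factor(elapsed_seconds=0.5, num_samples=32000, sample_rate=16000)
print(f"RTF = {rtf:.3f}")
```

An RTF below 1 means the audio was generated faster than it takes to play back.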

Dart API

You can use the following code to try vits-piper-kk_KZ-issai-high with the Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-kk_KZ-issai-high/kk_KZ-issai-high.onnx',
    tokens: 'vits-piper-kk_KZ-issai-high/tokens.txt',
    dataDir: 'vits-piper-kk_KZ-issai-high/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Әлемнің жұлдыздары сенің көзің, жаным.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to try vits-piper-kk_KZ-issai-high with the Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-kk_KZ-issai-high/kk_KZ-issai-high.onnx",
    lexicon: "",
    tokens: "vits-piper-kk_KZ-issai-high/tokens.txt",
    dataDir: "vits-piper-kk_KZ-issai-high/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Әлемнің жұлдыздары сенің көзің, жаным."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to try vits-piper-kk_KZ-issai-high with the C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-kk_KZ-issai-high/kk_KZ-issai-high.onnx";
config.Model.Vits.Tokens = "vits-piper-kk_KZ-issai-high/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-kk_KZ-issai-high/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Әлемнің жұлдыздары сенің көзің, жаным.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to try vits-piper-kk_KZ-issai-high with the Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-kk_KZ-issai-high/kk_KZ-issai-high.onnx",
        tokens = "vits-piper-kk_KZ-issai-high/tokens.txt",
        dataDir = "vits-piper-kk_KZ-issai-high/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Әлемнің жұлдыздары сенің көзің, жаным.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to try vits-piper-kk_KZ-issai-high with the Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-kk_KZ-issai-high/kk_KZ-issai-high.onnx");
    vits.setTokens("vits-piper-kk_KZ-issai-high/tokens.txt");
    vits.setDataDir("vits-piper-kk_KZ-issai-high/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Әлемнің жұлдыздары сенің көзің, жаным.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to try vits-piper-kk_KZ-issai-high with the Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-kk_KZ-issai-high/kk_KZ-issai-high.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-kk_KZ-issai-high/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-kk_KZ-issai-high/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Әлемнің жұлдыздары сенің көзің, жаным.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to try vits-piper-kk_KZ-issai-high with the Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-kk_KZ-issai-high/kk_KZ-issai-high.onnx",
				Tokens:  "vits-piper-kk_KZ-issai-high/tokens.txt",
				DataDir: "vits-piper-kk_KZ-issai-high/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Әлемнің жұлдыздары сенің көзің, жаным."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Әлемнің жұлдыздары сенің көзің, жаным.

sample audio for each speaker is listed below:

Speaker 0

Speaker 1

Speaker 2

Speaker 3

Speaker 4

Speaker 5
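
To produce one file per speaker locally, you can loop over the speaker IDs. The sketch below assumes a `tts` object whose `generate(text=..., sid=..., speed=...)` call returns an object with `samples` and `sample_rate`, as in the Python API example for this model. The helper names and the `speaker-N.wav` file naming are hypothetical, and the files are written with Python's standard `wave` module rather than `soundfile`.

```python
import struct
import wave
from pathlib import Path


def write_wave_16bit(path, samples, sample_rate):
    """Write mono float samples in [-1, 1] as 16-bit PCM using the stdlib."""
    clipped = [max(-1.0, min(1.0, float(s))) for s in samples]
    pcm = struct.pack(f"<{len(clipped)}h", *(int(s * 32767) for s in clipped))
    with wave.open(str(path), "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)  # 2 bytes = 16 bits per sample
        f.setframerate(sample_rate)
        f.writeframes(pcm)


def generate_per_speaker(tts, text, num_speakers, out_dir="."):
    """Call tts.generate() once per speaker ID and save each result."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for sid in range(num_speakers):
        audio = tts.generate(text=text, sid=sid, speed=1.0)
        path = out_dir / f"speaker-{sid}.wav"  # hypothetical naming scheme
        write_wave_16bit(path, audio.samples, audio.sample_rate)
        paths.append(path)
    return paths
```

With the `tts` object from the Python example, `generate_per_speaker(tts, text, num_speakers=6)` would cover speakers 0 to 5 listed above.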

vits-piper-kk_KZ-raya-x_low

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/kk/kk_KZ/raya/x_low

| Number of speakers | Sample rate |
| --- | --- |
| 1 | 16000 |

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-kk_KZ-raya-x_low.tar.bz2
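
If you prefer to fetch and unpack the archive from Python instead of using wget and tar, a minimal sketch using only the standard library could look like this. The function names are hypothetical. The top-level directory name inside the archive is assumed to be vits-piper-kk_KZ-raya-x_low, matching the paths used in the code examples for this model.

```python
import tarfile
import urllib.request
from pathlib import Path

MODEL_URL = (
    "https://github.com/k2-fsa/sherpa-onnx/releases/download/"
    "tts-models/vits-piper-kk_KZ-raya-x_low.tar.bz2"
)


def download(url, dest_dir="."):
    """Download url into dest_dir and return the local archive path."""
    archive = Path(dest_dir) / url.rsplit("/", 1)[-1]
    urllib.request.urlretrieve(url, str(archive))
    return archive


def extract(archive, dest_dir="."):
    """Unpack a .tar.bz2 archive and return its top-level entry names."""
    with tarfile.open(str(archive), "r:bz2") as tar:
        names = sorted({m.name.split("/", 1)[0] for m in tar.getmembers()})
        if hasattr(tarfile, "data_filter"):
            # Newer Python versions support safer extraction filters.
            tar.extractall(str(dest_dir), filter="data")
        else:
            tar.extractall(str(dest_dir))
    return names
```

Calling `extract(download(MODEL_URL))` downloads the archive into the current directory and unpacks it there.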

Android APK

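The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2:

| ABI | URL | China mirror |
| --- | --- | --- |
| arm64-v8a | Download | Download |
| armeabi-v7a | Download | Download |
| x86_64 | Download | Download |
| x86 | Download | Download |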

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to try vits-piper-kk_KZ-raya-x_low with the C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-kk_KZ-raya-x_low/kk_KZ-raya-x_low.onnx";
  config.model.vits.tokens = "vits-piper-kk_KZ-raya-x_low/tokens.txt";
  config.model.vits.data_dir = "vits-piper-kk_KZ-raya-x_low/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Әлемнің жұлдыздары сенің көзің, жаным.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc   -I /tmp/sherpa-onnx/shared/include   -L /tmp/sherpa-onnx/shared/lib   -o /tmp/test-piper   /tmp/test-piper.c   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-kk_KZ-raya-x_low.tar.bz2

You can use the following code to try vits-piper-kk_KZ-raya-x_low:

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-kk_KZ-raya-x_low/kk_KZ-raya-x_low.onnx",
            data_dir="vits-piper-kk_KZ-raya-x_low/espeak-ng-data",
            tokens="vits-piper-kk_KZ-raya-x_low/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Әлемнің жұлдыздары сенің көзің, жаным.",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
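
When `config.validate()` fails in the example above, a common cause is a wrong path. The following sketch checks the three paths the example passes to the config before you build it. The helper name is hypothetical, and this check is an addition for illustration, not something sherpa-onnx requires.

```python
from pathlib import Path


def missing_model_files(model_dir, model_name):
    """Return the expected paths under model_dir that do not exist."""
    root = Path(model_dir)
    expected = [root / model_name, root / "tokens.txt", root / "espeak-ng-data"]
    return [str(p) for p in expected if not p.exists()]


missing = missing_model_files("vits-piper-kk_KZ-raya-x_low", "kk_KZ-raya-x_low.onnx")
if missing:
    print("Missing:", ", ".join(missing))
```

An empty list means all three paths exist relative to the current working directory.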

C++ API

You can use the following code to try vits-piper-kk_KZ-raya-x_low with the C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-kk_KZ-raya-x_low/kk_KZ-raya-x_low.onnx";
  config.model.vits.tokens = "vits-piper-kk_KZ-raya-x_low/tokens.txt";
  config.model.vits.data_dir = "vits-piper-kk_KZ-raya-x_low/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Әлемнің жұлдыздары сенің көзің, жаным.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++   -std=c++17   -I /tmp/sherpa-onnx/shared/include   -L /tmp/sherpa-onnx/shared/lib   -o /tmp/test-piper   /tmp/test-piper.cc   -lsherpa-onnx-cxx-api   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to try vits-piper-kk_KZ-raya-x_low with the Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-kk_KZ-raya-x_low/kk_KZ-raya-x_low.onnx".into()),
                tokens: Some("vits-piper-kk_KZ-raya-x_low/tokens.txt".into()),
                data_dir: Some("vits-piper-kk_KZ-raya-x_low/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Әлемнің жұлдыздары сенің көзің, жаным.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-kk_KZ-raya-x_low with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-kk_KZ-raya-x_low/kk_KZ-raya-x_low.onnx',
        tokens: 'vits-piper-kk_KZ-raya-x_low/tokens.txt',
        dataDir: 'vits-piper-kk_KZ-raya-x_low/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Әлемнің жұлдыздары сенің көзің, жаным.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to try vits-piper-kk_KZ-raya-x_low with the Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-kk_KZ-raya-x_low/kk_KZ-raya-x_low.onnx',
    tokens: 'vits-piper-kk_KZ-raya-x_low/tokens.txt',
    dataDir: 'vits-piper-kk_KZ-raya-x_low/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Әлемнің жұлдыздары сенің көзің, жаным.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to try vits-piper-kk_KZ-raya-x_low with the Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-kk_KZ-raya-x_low/kk_KZ-raya-x_low.onnx",
    lexicon: "",
    tokens: "vits-piper-kk_KZ-raya-x_low/tokens.txt",
    dataDir: "vits-piper-kk_KZ-raya-x_low/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Әлемнің жұлдыздары сенің көзің, жаным."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to try vits-piper-kk_KZ-raya-x_low with the C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-kk_KZ-raya-x_low/kk_KZ-raya-x_low.onnx";
config.Model.Vits.Tokens = "vits-piper-kk_KZ-raya-x_low/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-kk_KZ-raya-x_low/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Әлемнің жұлдыздары сенің көзің, жаным.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to try vits-piper-kk_KZ-raya-x_low with the Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-kk_KZ-raya-x_low/kk_KZ-raya-x_low.onnx",
        tokens = "vits-piper-kk_KZ-raya-x_low/tokens.txt",
        dataDir = "vits-piper-kk_KZ-raya-x_low/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Әлемнің жұлдыздары сенің көзің, жаным.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to try vits-piper-kk_KZ-raya-x_low with the Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-kk_KZ-raya-x_low/kk_KZ-raya-x_low.onnx");
    vits.setTokens("vits-piper-kk_KZ-raya-x_low/tokens.txt");
    vits.setDataDir("vits-piper-kk_KZ-raya-x_low/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Әлемнің жұлдыздары сенің көзің, жаным.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to try vits-piper-kk_KZ-raya-x_low with the Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-kk_KZ-raya-x_low/kk_KZ-raya-x_low.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-kk_KZ-raya-x_low/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-kk_KZ-raya-x_low/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Әлемнің жұлдыздары сенің көзің, жаным.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to try vits-piper-kk_KZ-raya-x_low with the Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-kk_KZ-raya-x_low/kk_KZ-raya-x_low.onnx",
				Tokens:  "vits-piper-kk_KZ-raya-x_low/tokens.txt",
				DataDir: "vits-piper-kk_KZ-raya-x_low/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Әлемнің жұлдыздары сенің көзің, жаным."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Әлемнің жұлдыздары сенің көзің, жаным.

sample audio for each speaker is listed below:

Speaker 0

Korean

This section lists text to speech models for Korean.

supertonic-3-ko

Info about this model

This model is supertonic 3 from https://huggingface.co/Supertone/supertonic-3

It supports 31 languages: en, ko, ja, ar, bg, cs, da, de, el, es, et, fi, fr, hi, hr, hu, id, it, lt, lv, nl, pl, pt, ro, ru, sk, sl, sv, tr, uk, vi.

This page shows samples for Korean (ko).

| Number of speakers | Sample rate |
| --- | --- |
| 10 | 24000 |

Speaker IDs

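| sid | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |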

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-supertonic-3-tts-int8-2026-05-11.tar.bz2

Android APK

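The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2:

| ABI | URL | China mirror |
| --- | --- | --- |
| arm64-v8a | Download | Download |
| armeabi-v7a | Download | Download |
| x86_64 | Download | Download |
| x86 | Download | Download |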

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-supertonic-3-tts-int8-2026-05-11.tar.bz2

You can use the following code to play with supertonic-3

import sherpa_onnx
import soundfile as sf

tts_config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        supertonic=sherpa_onnx.OfflineTtsSupertonicModelConfig(
            duration_predictor="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
            text_encoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
            vector_estimator="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
            vocoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
            tts_json="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
            unicode_indexer="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
            voice_style="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
        ),
        debug=False,
        num_threads=2,
        provider="cpu",
    ),
)

tts = sherpa_onnx.OfflineTts(tts_config)

gen_config = sherpa_onnx.GenerationConfig()
gen_config.sid = 0
gen_config.num_steps = 8
gen_config.speed = 1.0
gen_config.extra["lang"] = "ko"

audio = tts.generate("이것은 차세대 kaldi를 사용하는 텍스트 음성 변환 엔진입니다", gen_config)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
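
To sanity-check the generated file, a small standard-library sketch such as the following can read the WAV header. It assumes test.wav is a PCM WAV file that Python's wave module can read. `describe_wav` is a hypothetical helper and not part of sherpa-onnx.

```python
import wave


def describe_wav(path: str) -> dict:
    """Read the WAV header and return sample rate, channel count, and duration."""
    with wave.open(path, "rb") as f:
        sample_rate = f.getframerate()
        num_frames = f.getnframes()
        return {
            "sample_rate": sample_rate,
            "channels": f.getnchannels(),
            "duration_seconds": num_frames / sample_rate,
        }
```

Calling `describe_wav("test.wav")` after running the script above gives a quick way to compare the reported sample rate with the value listed for this model.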

C API

You can use the following code to play with supertonic-3 with C API.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));
  config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
  config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
  config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
  config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
  config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
  config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
  config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";

  config.model.num_threads = 2;

  // If you don't want to see debug messages, please set it to 0
  config.model.debug = 1;

  const char *text = "이것은 차세대 kaldi를 사용하는 텍스트 음성 변환 엔진입니다";

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.num_steps = 8;
  gen_cfg.speed = 1.0;
  gen_cfg.extra = "{\"lang\": \"ko\"}";

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg,
                                             ProgressCallback, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
 -DSHERPA_ONNX_ENABLE_C_API=ON \
 -DCMAKE_BUILD_TYPE=Release \
 -DBUILD_SHARED_LIBS=ON \
 -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
 ..

make
make install

You can find the required header and library files in /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-supertonic.c. Then you can compile it with the following command:

gcc \
  -I /tmp/sherpa-onnx/shared/include \
  -L /tmp/sherpa-onnx/shared/lib \
  -o /tmp/test-supertonic \
  /tmp/test-supertonic.c \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-supertonic.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

C++ API

You can use the following code to play with supertonic-3 with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;

  config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
  config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
  config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
  config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
  config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
  config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
  config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";

  config.model.num_threads = 2;

  // If you don't want to see debug messages, please set it to 0
  config.model.debug = 1;

  std::string filename = "./test.wav";
  std::string text = "이것은 차세대 kaldi를 사용하는 텍스트 음성 변환 엔진입니다";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.num_steps = 8;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed
  gen_cfg.extra = R"({"lang": "ko"})";

  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
 -DSHERPA_ONNX_ENABLE_C_API=ON \
 -DCMAKE_BUILD_TYPE=Release \
 -DBUILD_SHARED_LIBS=ON \
 -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
 ..

make
make install

You can find the required header and library files in /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-supertonic.cc. Then you can compile it with the following command:

g++ \
  -std=c++17 \
  -I /tmp/sherpa-onnx/shared/include \
  -L /tmp/sherpa-onnx/shared/lib \
  -o /tmp/test-supertonic \
  /tmp/test-supertonic.cc \
  -lsherpa-onnx-cxx-api \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-supertonic.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with supertonic-3 with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsSupertonicModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            supertonic: OfflineTtsSupertonicModelConfig {
                duration_predictor: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx".into()),
                text_encoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx".into()),
                vector_estimator: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx".into()),
                vocoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx".into()),
                tts_json: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json".into()),
                unicode_indexer: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin".into()),
                voice_style: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "이것은 차세대 kaldi를 사용하는 텍스트 음성 변환 엔진입니다";

    let gen_config = GenerationConfig {
        sid: 0,
        num_steps: 8,
        speed: 1.0,
        extra: Some(r#"{"lang": "ko"}"#.into()),
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with supertonic-3 with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      supertonic: {
        durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
        textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
        vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
        vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
        ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
        unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
        voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
      },
      debug: true,
      numThreads: 2,
      provider: 'cpu',
    },
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = '이것은 차세대 kaldi를 사용하는 텍스트 음성 변환 엔진입니다';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  numSteps: 8,
  speed: 1.0,
  extra: {lang: 'ko'},
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.
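
The Node.js example above reports a real-time factor (RTF), computed as the elapsed generation time divided by the duration of the generated audio. The same arithmetic can be sketched in Python as follows. `real_time_factor` is a hypothetical helper, and the numbers in the example call are illustrative, not measurements.

```python
def real_time_factor(elapsed_seconds: float, num_samples: int, sample_rate: int) -> float:
    """RTF = processing time / audio duration; below 1.0 means faster than real time."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate
    return elapsed_seconds / duration


# Illustrative numbers only: 48000 samples at 24000 Hz is 2 seconds of audio,
# so generating it in 0.5 seconds gives an RTF of 0.25.
print(real_time_factor(0.5, 48000, 24000))  # prints 0.25
```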

Dart API

You can use the following code to play with supertonic-3 with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final supertonic = sherpa_onnx.OfflineTtsSupertonicModelConfig(
    durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
    textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
    vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
    vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
    ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
    unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
    voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    supertonic: supertonic,
    numThreads: 2,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    numSteps: 8,
    speed: 1.0,
    extra: {'lang': 'ko'},
  );
  final audio = tts.generateWithConfig(text: '이것은 차세대 kaldi를 사용하는 텍스트 음성 변환 엔진입니다', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with supertonic-3 with Swift API.

func run() {
  let supertonic = sherpaOnnxOfflineTtsSupertonicModelConfig(
    durationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
    textEncoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
    vectorEstimator: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
    vocoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
    ttsJson: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
    unicodeIndexer: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
    voiceStyle: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(supertonic: supertonic)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "이것은 차세대 kaldi를 사용하는 텍스트 음성 변환 엔진입니다"
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.numSteps = 8
  genConfig.speed = 1.0
  genConfig.extra = ["lang": "ko"]

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with supertonic-3 with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Supertonic.DurationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.Model.Supertonic.TextEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.Model.Supertonic.VectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.Model.Supertonic.Vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.Model.Supertonic.TtsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.Model.Supertonic.UnicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.Model.Supertonic.VoiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";

config.Model.NumThreads = 2;
config.Model.Debug = 1;
config.Model.Provider = "cpu";

var tts = new OfflineTts(config);
var text = "이것은 차세대 kaldi를 사용하는 텍스트 음성 변환 엔진입니다";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.NumSteps = 8;
genConfig.Speed = 1.0f;
genConfig.Extra = "{\"lang\": \"ko\"}";

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with supertonic-3 with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      supertonic = OfflineTtsSupertonicModelConfig(
        durationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
        textEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
        vectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
        vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
        ttsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
        unicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
        voiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
      ),
      numThreads = 2,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    numSteps = 8,
    speed = 1.0f,
    extra = mapOf("lang" to "ko"),
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "이것은 차세대 kaldi를 사용하는 텍스트 음성 변환 엔진입니다",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with supertonic-3 with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var supertonic = new OfflineTtsSupertonicModelConfig();
    supertonic.setDurationPredictor("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx");
    supertonic.setTextEncoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx");
    supertonic.setVectorEstimator("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx");
    supertonic.setVocoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx");
    supertonic.setTtsJson("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json");
    supertonic.setUnicodeIndexer("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin");
    supertonic.setVoiceStyle("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setSupertonic(supertonic);
    modelConfig.setNumThreads(2);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);

    var tts = new OfflineTts(config);
    var text = "이것은 차세대 kaldi를 사용하는 텍스트 음성 변환 엔진입니다";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setNumSteps(8);
    genConfig.setSpeed(1.0f);
    genConfig.setExtra("{\"lang\": \"ko\"}");

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with supertonic-3 with Pascal API.

program test_supertonic;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Supertonic.DurationPredictor := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx';
  Config.Model.Supertonic.TextEncoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx';
  Config.Model.Supertonic.VectorEstimator := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx';
  Config.Model.Supertonic.Vocoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx';
  Config.Model.Supertonic.TtsJson := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json';
  Config.Model.Supertonic.UnicodeIndexer := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin';
  Config.Model.Supertonic.VoiceStyle := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin';

  Config.Model.NumThreads := 2;
  Config.Model.Debug := True;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.NumSteps := 8;
  GenConfig.Speed := 1.0;
  GenConfig.Extra := '{"lang": "ko"}';

  Audio := Tts.GenerateWithConfig('이것은 차세대 kaldi를 사용하는 텍스트 음성 변환 엔진입니다', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with supertonic-3 with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Supertonic: sherpa.OfflineTtsSupertonicModelConfig{
				DurationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
				TextEncoder:       "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
				VectorEstimator:   "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
				Vocoder:           "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
				TtsJson:           "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
				UnicodeIndexer:    "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
				VoiceStyle:        "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
			},
			NumThreads: 2,
			Debug:      true,
		},
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "이것은 차세대 kaldi를 사용하는 텍스트 음성 변환 엔진입니다"

	genConfig := sherpa.GenerationConfig{
		Sid:       0,
		NumSteps:  8,
		Speed:     1.0,
		Extra:     `{"lang": "ko"}`,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

Sample audios for different speakers are listed below:

Speaker 0

0

안녕하세요 세계.

1

오늘 어떻게 지내세요?

2

하늘이 푸릅니다.

3

기계학습을 사랑합니다.

4

파이썬은 놀라워요.

5

모든 분께 좋은 아침입니다.

6

인공지능이 성장하고 있습니다.

7

음성 합성은 매력적입니다.

8

신경막은 강력합니다.

9

텍스트 음성 변환이 텍스트를 오디오로 변환합니다.

10

빠른 갈색 여우가 게으른 개를 뛰어넘습니다.

11

기계학습이 컴퓨터가 데이터로 학습할 수 있게 합니다.

12

자연어 처리가 기계를 이해하도록 돕습니다.

13

딥러닝이 인공지능을 혁신했습니다.

14

음성 합성 기술이 크게 발전했습니다.

15

음성 클로닝이 음성 스타일을 복제할 수 있습니다.

16

텍스트 정규화가 올바른 발음에 중요합니다.

17

음성 비서가 기술과 상호작용하는 데 도움이 됩니다.

18

최신 TTS 시스템이 고품질 음성을 생성합니다.

19

인간 컴퓨터 상호작용이 더 직관적이 되었습니다.

Speaker 1

0

안녕하세요 세계.

1

오늘 어떻게 지내세요?

2

하늘이 푸릅니다.

3

기계학습을 사랑합니다.

4

파이썬은 놀라워요.

5

모든 분께 좋은 아침입니다.

6

인공지능이 성장하고 있습니다.

7

음성 합성은 매력적입니다.

8

신경막은 강력합니다.

9

텍스트 음성 변환이 텍스트를 오디오로 변환합니다.

10

빠른 갈색 여우가 게으른 개를 뛰어넘습니다.

11

기계학습이 컴퓨터가 데이터로 학습할 수 있게 합니다.

12

자연어 처리가 기계를 이해하도록 돕습니다.

13

딥러닝이 인공지능을 혁신했습니다.

14

음성 합성 기술이 크게 발전했습니다.

15

음성 클로닝이 음성 스타일을 복제할 수 있습니다.

16

텍스트 정규화가 올바른 발음에 중요합니다.

17

음성 비서가 기술과 상호작용하는 데 도움이 됩니다.

18

최신 TTS 시스템이 고품질 음성을 생성합니다.

19

인간 컴퓨터 상호작용이 더 직관적이 되었습니다.

Speaker 2

0

안녕하세요 세계.

1

오늘 어떻게 지내세요?

2

하늘이 푸릅니다.

3

기계학습을 사랑합니다.

4

파이썬은 놀라워요.

5

모든 분께 좋은 아침입니다.

6

인공지능이 성장하고 있습니다.

7

음성 합성은 매력적입니다.

8

신경막은 강력합니다.

9

텍스트 음성 변환이 텍스트를 오디오로 변환합니다.

10

빠른 갈색 여우가 게으른 개를 뛰어넘습니다.

11

기계학습이 컴퓨터가 데이터로 학습할 수 있게 합니다.

12

자연어 처리가 기계를 이해하도록 돕습니다.

13

딥러닝이 인공지능을 혁신했습니다.

14

음성 합성 기술이 크게 발전했습니다.

15

음성 클로닝이 음성 스타일을 복제할 수 있습니다.

16

텍스트 정규화가 올바른 발음에 중요합니다.

17

음성 비서가 기술과 상호작용하는 데 도움이 됩니다.

18

최신 TTS 시스템이 고품질 음성을 생성합니다.

19

인간 컴퓨터 상호작용이 더 직관적이 되었습니다.

Speaker 3

0

안녕하세요 세계.

1

오늘 어떻게 지내세요?

2

하늘이 푸릅니다.

3

기계학습을 사랑합니다.

4

파이썬은 놀라워요.

5

모든 분께 좋은 아침입니다.

6

인공지능이 성장하고 있습니다.

7

음성 합성은 매력적입니다.

8

신경막은 강력합니다.

9

텍스트 음성 변환이 텍스트를 오디오로 변환합니다.

10

빠른 갈색 여우가 게으른 개를 뛰어넘습니다.

11

기계학습이 컴퓨터가 데이터로 학습할 수 있게 합니다.

12

자연어 처리가 기계를 이해하도록 돕습니다.

13

딥러닝이 인공지능을 혁신했습니다.

14

음성 합성 기술이 크게 발전했습니다.

15

음성 클로닝이 음성 스타일을 복제할 수 있습니다.

16

텍스트 정규화가 올바른 발음에 중요합니다.

17

음성 비서가 기술과 상호작용하는 데 도움이 됩니다.

18

최신 TTS 시스템이 고품질 음성을 생성합니다.

19

인간 컴퓨터 상호작용이 더 직관적이 되었습니다.

Speaker 4

0

안녕하세요 세계.

1

오늘 어떻게 지내세요?

2

하늘이 푸릅니다.

3

기계학습을 사랑합니다.

4

파이썬은 놀라워요.

5

모든 분께 좋은 아침입니다.

6

인공지능이 성장하고 있습니다.

7

음성 합성은 매력적입니다.

8

신경막은 강력합니다.

9

텍스트 음성 변환이 텍스트를 오디오로 변환합니다.

10

빠른 갈색 여우가 게으른 개를 뛰어넘습니다.

11

기계학습이 컴퓨터가 데이터로 학습할 수 있게 합니다.

12

자연어 처리가 기계를 이해하도록 돕습니다.

13

딥러닝이 인공지능을 혁신했습니다.

14

음성 합성 기술이 크게 발전했습니다.

15

음성 클로닝이 음성 스타일을 복제할 수 있습니다.

16

텍스트 정규화가 올바른 발음에 중요합니다.

17

음성 비서가 기술과 상호작용하는 데 도움이 됩니다.

18

최신 TTS 시스템이 고품질 음성을 생성합니다.

19

인간 컴퓨터 상호작용이 더 직관적이 되었습니다.

Speaker 5

0

안녕하세요 세계.

1

오늘 어떻게 지내세요?

2

하늘이 푸릅니다.

3

기계학습을 사랑합니다.

4

파이썬은 놀라워요.

5

모든 분께 좋은 아침입니다.

6

인공지능이 성장하고 있습니다.

7

음성 합성은 매력적입니다.

8

신경막은 강력합니다.

9

텍스트 음성 변환이 텍스트를 오디오로 변환합니다.

10

빠른 갈색 여우가 게으른 개를 뛰어넘습니다.

11

기계학습이 컴퓨터가 데이터로 학습할 수 있게 합니다.

12

자연어 처리가 기계를 이해하도록 돕습니다.

13

딥러닝이 인공지능을 혁신했습니다.

14

음성 합성 기술이 크게 발전했습니다.

15

음성 클로닝이 음성 스타일을 복제할 수 있습니다.

16

텍스트 정규화가 올바른 발음에 중요합니다.

17

음성 비서가 기술과 상호작용하는 데 도움이 됩니다.

18

최신 TTS 시스템이 고품질 음성을 생성합니다.

19

인간 컴퓨터 상호작용이 더 직관적이 되었습니다.

Speaker 6

0

안녕하세요 세계.

1

오늘 어떻게 지내세요?

2

하늘이 푸릅니다.

3

기계학습을 사랑합니다.

4

파이썬은 놀라워요.

5

모든 분께 좋은 아침입니다.

6

인공지능이 성장하고 있습니다.

7

음성 합성은 매력적입니다.

8

신경막은 강력합니다.

9

텍스트 음성 변환이 텍스트를 오디오로 변환합니다.

10

빠른 갈색 여우가 게으른 개를 뛰어넘습니다.

11

기계학습이 컴퓨터가 데이터로 학습할 수 있게 합니다.

12

자연어 처리가 기계를 이해하도록 돕습니다.

13

딥러닝이 인공지능을 혁신했습니다.

14

음성 합성 기술이 크게 발전했습니다.

15

음성 클로닝이 음성 스타일을 복제할 수 있습니다.

16

텍스트 정규화가 올바른 발음에 중요합니다.

17

음성 비서가 기술과 상호작용하는 데 도움이 됩니다.

18

최신 TTS 시스템이 고품질 음성을 생성합니다.

19

인간 컴퓨터 상호작용이 더 직관적이 되었습니다.

Speaker 7

0

안녕하세요 세계.

1

오늘 어떻게 지내세요?

2

하늘이 푸릅니다.

3

기계학습을 사랑합니다.

4

파이썬은 놀라워요.

5

모든 분께 좋은 아침입니다.

6

인공지능이 성장하고 있습니다.

7

음성 합성은 매력적입니다.

8

신경막은 강력합니다.

9

텍스트 음성 변환이 텍스트를 오디오로 변환합니다.

10

빠른 갈색 여우가 게으른 개를 뛰어넘습니다.

11

기계학습이 컴퓨터가 데이터로 학습할 수 있게 합니다.

12

자연어 처리가 기계를 이해하도록 돕습니다.

13

딥러닝이 인공지능을 혁신했습니다.

14

음성 합성 기술이 크게 발전했습니다.

15

음성 클로닝이 음성 스타일을 복제할 수 있습니다.

16

텍스트 정규화가 올바른 발음에 중요합니다.

17

음성 비서가 기술과 상호작용하는 데 도움이 됩니다.

18

최신 TTS 시스템이 고품질 음성을 생성합니다.

19

인간 컴퓨터 상호작용이 더 직관적이 되었습니다.

Speaker 8

0

안녕하세요 세계.

1

오늘 어떻게 지내세요?

2

하늘이 푸릅니다.

3

기계학습을 사랑합니다.

4

파이썬은 놀라워요.

5

모든 분께 좋은 아침입니다.

6

인공지능이 성장하고 있습니다.

7

음성 합성은 매력적입니다.

8

신경막은 강력합니다.

9

텍스트 음성 변환이 텍스트를 오디오로 변환합니다.

10

빠른 갈색 여우가 게으른 개를 뛰어넘습니다.

11

기계학습이 컴퓨터가 데이터로 학습할 수 있게 합니다.

12

자연어 처리가 기계를 이해하도록 돕습니다.

13

딥러닝이 인공지능을 혁신했습니다.

14

음성 합성 기술이 크게 발전했습니다.

15

음성 클로닝이 음성 스타일을 복제할 수 있습니다.

16

텍스트 정규화가 올바른 발음에 중요합니다.

17

음성 비서가 기술과 상호작용하는 데 도움이 됩니다.

18

최신 TTS 시스템이 고품질 음성을 생성합니다.

19

인간 컴퓨터 상호작용이 더 직관적이 되었습니다.

Speaker 9

0

안녕하세요 세계.

1

오늘 어떻게 지내세요?

2

하늘이 푸릅니다.

3

기계학습을 사랑합니다.

4

파이썬은 놀라워요.

5

모든 분께 좋은 아침입니다.

6

인공지능이 성장하고 있습니다.

7

음성 합성은 매력적입니다.

8

신경막은 강력합니다.

9

텍스트 음성 변환이 텍스트를 오디오로 변환합니다.

10

빠른 갈색 여우가 게으른 개를 뛰어넘습니다.

11

기계학습이 컴퓨터가 데이터로 학습할 수 있게 합니다.

12

자연어 처리가 기계를 이해하도록 돕습니다.

13

딥러닝이 인공지능을 혁신했습니다.

14

음성 합성 기술이 크게 발전했습니다.

15

음성 클로닝이 음성 스타일을 복제할 수 있습니다.

16

텍스트 정규화가 올바른 발음에 중요합니다.

17

음성 비서가 기술과 상호작용하는 데 도움이 됩니다.

18

최신 TTS 시스템이 고품질 음성을 생성합니다.

19

인간 컴퓨터 상호작용이 더 직관적이 되었습니다.

Kurdish

This section lists text to speech models for Kurdish.

vits-piper-ku_TR-berfin_renas-medium

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/ku/ku_TR/berfin_renas/medium

Number of speakers    Sample rate
1                     22050

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-ku_TR-berfin_renas-medium.tar.bz2

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI            URL         China mirror
arm64-v8a      Download    Download
armeabi-v7a    Download    Download
x86_64         Download    Download
x86            Download    Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-ku_TR-berfin_renas-medium with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-ku_TR-berfin_renas-medium/ku_TR-berfin_renas-medium.onnx";
  config.model.vits.tokens = "vits-piper-ku_TR-berfin_renas-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-ku_TR-berfin_renas-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Ev motorê nivîsandinê bi dengî ye ku Kaldiya serî de bi kar tîne";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
 -DSHERPA_ONNX_ENABLE_C_API=ON \
 -DCMAKE_BUILD_TYPE=Release \
 -DBUILD_SHARED_LIBS=ON \
 -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
 ..

make
make install

You can find the required header and library files in /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc \
  -I /tmp/sherpa-onnx/shared/include \
  -L /tmp/sherpa-onnx/shared/lib \
  -o /tmp/test-piper \
  /tmp/test-piper.c \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-ku_TR-berfin_renas-medium.tar.bz2

You can use the following code to try vits-piper-ku_TR-berfin_renas-medium with the Python API.

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-ku_TR-berfin_renas-medium/ku_TR-berfin_renas-medium.onnx",
            data_dir="vits-piper-ku_TR-berfin_renas-medium/espeak-ng-data",
            tokens="vits-piper-ku_TR-berfin_renas-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Ev motorê nivîsandinê bi dengî ye ku Kaldiya serî de bi kar tîne",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
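
To check a result without listening to it, the header of a generated WAV file can be inspected. The following is a minimal sketch that uses only Python's standard wave module. The helper name describe_wav is hypothetical and not part of sherpa-onnx, and the sketch assumes a PCM WAV file such as the ./test.wav written by the C example above.

```python
import wave


def describe_wav(path):
    """Return (sample_rate, num_channels, num_frames, duration_seconds) of a PCM WAV file."""
    with wave.open(path, "rb") as f:
        sample_rate = f.getframerate()
        num_channels = f.getnchannels()
        num_frames = f.getnframes()
    # Duration follows directly from the frame count and the sample rate.
    duration = num_frames / sample_rate
    return sample_rate, num_channels, num_frames, duration
```

Calling describe_wav("./test.wav") should report one channel and the sample rate of the model, which is a quick way to confirm that the file was written completely.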

C++ API

You can use the following code to try vits-piper-ku_TR-berfin_renas-medium with the C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-ku_TR-berfin_renas-medium/ku_TR-berfin_renas-medium.onnx";
  config.model.vits.tokens = "vits-piper-ku_TR-berfin_renas-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-ku_TR-berfin_renas-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Ev motorê nivîsandinê bi dengî ye ku Kaldiya serî de bi kar tîne";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++ -std=c++17 /tmp/test-piper.cc -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime -o /tmp/test-piper

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html
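
The examples above use relative paths, so they only work when the extracted model directory is in the current working directory. As a rough pre-flight check, the sketch below lists which of the three entries referenced by the examples (the .onnx file, tokens.txt, and espeak-ng-data) are missing. The function name missing_model_files is hypothetical and not part of sherpa-onnx.

```python
from pathlib import Path


def missing_model_files(model_dir, onnx_name):
    """Return the expected entries that are absent from model_dir (empty list if none)."""
    root = Path(model_dir)
    # These are the three paths that the configuration in the examples points to.
    expected = [onnx_name, "tokens.txt", "espeak-ng-data"]
    return [name for name in expected if not (root / name).exists()]
```

For the model in this section, a call could look like missing_model_files("/tmp/vits-piper-ku_TR-berfin_renas-medium", "ku_TR-berfin_renas-medium.onnx"). An empty list means all three entries were found.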

Rust API

You can use the following code to try vits-piper-ku_TR-berfin_renas-medium with the Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-ku_TR-berfin_renas-medium/ku_TR-berfin_renas-medium.onnx".into()),
                tokens: Some("vits-piper-ku_TR-berfin_renas-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-ku_TR-berfin_renas-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Ev motorê nivîsandinê bi dengî ye ku Kaldiya serî de bi kar tîne";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to try vits-piper-ku_TR-berfin_renas-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-ku_TR-berfin_renas-medium/ku_TR-berfin_renas-medium.onnx',
        tokens: 'vits-piper-ku_TR-berfin_renas-medium/tokens.txt',
        dataDir: 'vits-piper-ku_TR-berfin_renas-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Ev motorê nivîsandinê bi dengî ye ku Kaldiya serî de bi kar tîne';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.
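
The Node.js example above reports a real-time factor (RTF), which is the elapsed synthesis time divided by the duration of the generated audio. A value below 1 means synthesis ran faster than playback. The same arithmetic is sketched below in Python; the function name real_time_factor and the numbers in the example call are made up for illustration.

```python
def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """Return elapsed time divided by audio duration (RTF)."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # audio length in seconds
    return elapsed_seconds / duration


# Made-up numbers: 0.5 s spent synthesizing 44100 samples at 22050 Hz (2 s of audio).
rtf = real_time_factor(0.5, 44100, 22050)
print(f"RTF = {rtf:.3f}")  # prints "RTF = 0.250"
```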

Dart API

You can use the following code to try vits-piper-ku_TR-berfin_renas-medium with the Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-ku_TR-berfin_renas-medium/ku_TR-berfin_renas-medium.onnx',
    tokens: 'vits-piper-ku_TR-berfin_renas-medium/tokens.txt',
    dataDir: 'vits-piper-ku_TR-berfin_renas-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Ev motorê nivîsandinê bi dengî ye ku Kaldiya serî de bi kar tîne', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to try vits-piper-ku_TR-berfin_renas-medium with the Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-ku_TR-berfin_renas-medium/ku_TR-berfin_renas-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-ku_TR-berfin_renas-medium/tokens.txt",
    dataDir: "vits-piper-ku_TR-berfin_renas-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Ev motorê nivîsandinê bi dengî ye ku Kaldiya serî de bi kar tîne"
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to try vits-piper-ku_TR-berfin_renas-medium with the C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-ku_TR-berfin_renas-medium/ku_TR-berfin_renas-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-ku_TR-berfin_renas-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-ku_TR-berfin_renas-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Ev motorê nivîsandinê bi dengî ye ku Kaldiya serî de bi kar tîne";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to try vits-piper-ku_TR-berfin_renas-medium with the Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-ku_TR-berfin_renas-medium/ku_TR-berfin_renas-medium.onnx",
        tokens = "vits-piper-ku_TR-berfin_renas-medium/tokens.txt",
        dataDir = "vits-piper-ku_TR-berfin_renas-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Ev motorê nivîsandinê bi dengî ye ku Kaldiya serî de bi kar tîne",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.
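
The callback in the example above returns 1 to continue generating and 0 to stop. As a language-neutral illustration of that convention, here is a minimal Python sketch of a callback that requests a stop once a sample budget is reached. Both make_limited_callback and drive are hypothetical names, and drive is only a stand-in loop that feeds chunks to the callback. It is not the sherpa-onnx generator.

```python
def make_limited_callback(max_samples):
    """Build a callback that asks generation to stop once max_samples were received."""
    state = {"received": 0}

    def callback(samples):
        state["received"] += len(samples)
        # 1 means continue, 0 means stop (same convention as in the example above).
        return 1 if state["received"] < max_samples else 0

    return callback, state


def drive(chunks, callback):
    """Hypothetical stand-in for a generator: feed chunks until the callback returns 0."""
    delivered = 0
    for chunk in chunks:
        delivered += 1
        if callback(chunk) == 0:
            break
    return delivered
```

With chunks of 100 samples and a budget of 250 samples, the callback returns 1 twice and 0 on the third chunk, so the loop ends after three chunks.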

Java API

You can use the following code to try vits-piper-ku_TR-berfin_renas-medium with the Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-ku_TR-berfin_renas-medium/ku_TR-berfin_renas-medium.onnx");
    vits.setTokens("vits-piper-ku_TR-berfin_renas-medium/tokens.txt");
    vits.setDataDir("vits-piper-ku_TR-berfin_renas-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Ev motorê nivîsandinê bi dengî ye ku Kaldiya serî de bi kar tîne";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to try vits-piper-ku_TR-berfin_renas-medium with the Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-ku_TR-berfin_renas-medium/ku_TR-berfin_renas-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-ku_TR-berfin_renas-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-ku_TR-berfin_renas-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Ev motorê nivîsandinê bi dengî ye ku Kaldiya serî de bi kar tîne', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to try vits-piper-ku_TR-berfin_renas-medium with the Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-ku_TR-berfin_renas-medium/ku_TR-berfin_renas-medium.onnx",
				Tokens:  "vits-piper-ku_TR-berfin_renas-medium/tokens.txt",
				DataDir: "vits-piper-ku_TR-berfin_renas-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Ev motorê nivîsandinê bi dengî ye ku Kaldiya serî de bi kar tîne"

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Ev motorê nivîsandinê bi dengî ye ku Kaldiya serî de bi kar tîne

sample audios for different speakers are listed below:

Speaker 0

Latvian

This section lists text to speech models for Latvian.

vits-piper-lv_LV-aivars-medium

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/lv/lv_LV/aivars/medium

Number of speakers | Sample rate
1 | 22050

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-lv_LV-aivars-medium.tar.bz2
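
If you prefer to fetch the archive from a script, the following is a minimal sketch that uses only the Python standard library. It downloads the archive, extracts it, and deletes the archive afterwards. The names MODEL_URL, extract_archive, and download_and_extract are hypothetical helpers, not part of sherpa-onnx, and the sketch assumes the archive is a regular .tar.bz2 file from a source you trust.

```python
import tarfile
import urllib.request
from pathlib import Path

MODEL_URL = (
    "https://github.com/k2-fsa/sherpa-onnx/releases/download/"
    "tts-models/vits-piper-lv_LV-aivars-medium.tar.bz2"
)


def extract_archive(archive_path, dest_dir="."):
    """Extract a .tar.bz2 archive into dest_dir and return the sorted top-level entry names."""
    with tarfile.open(archive_path, "r:bz2") as tar:
        names = sorted({name.split("/")[0] for name in tar.getnames()})
        tar.extractall(dest_dir)
    return names


def download_and_extract(url=MODEL_URL, dest_dir="."):
    """Download the archive into dest_dir, extract it, then delete the archive."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    archive_path = dest / url.rsplit("/", 1)[-1]
    urllib.request.urlretrieve(url, archive_path)
    try:
        return extract_archive(archive_path, dest)
    finally:
        archive_path.unlink()  # remove the downloaded archive in either case
```

Calling download_and_extract(dest_dir="/tmp") would place the extracted model directory under /tmp, which is where the run instructions in the API sections below expect it.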

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI | URL | China mirror
arm64-v8a | Download | Download
armeabi-v7a | Download | Download
x86_64 | Download | Download
x86 | Download | Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to try vits-piper-lv_LV-aivars-medium with the C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-lv_LV-aivars-medium/lv_LV-aivars-medium.onnx";
  config.model.vits.tokens = "vits-piper-lv_LV-aivars-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-lv_LV-aivars-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Zeme nenes augļus, ja tēvs sēj, bet māte auž.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc /tmp/test-piper.c -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime -o /tmp/test-piper

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-lv_LV-aivars-medium.tar.bz2

You can use the following code to try vits-piper-lv_LV-aivars-medium with the Python API.

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-lv_LV-aivars-medium/lv_LV-aivars-medium.onnx",
            data_dir="vits-piper-lv_LV-aivars-medium/espeak-ng-data",
            tokens="vits-piper-lv_LV-aivars-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Zeme nenes augļus, ja tēvs sēj, bet māte auž.",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
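
If soundfile is not available, the generated samples can also be saved with Python's standard wave module. The following is a minimal sketch under two assumptions: the samples are mono floating-point values in the range [-1, 1], and 16-bit PCM output is acceptable. The helper name write_wav_pcm16 is hypothetical and not part of sherpa-onnx.

```python
import struct
import wave


def write_wav_pcm16(path, samples, sample_rate):
    """Write mono float samples in [-1, 1] to a 16-bit PCM WAV file."""
    frames = bytearray()
    for s in samples:
        # Clip out-of-range values, then scale to the signed 16-bit range.
        clipped = max(-1.0, min(1.0, float(s)))
        frames += struct.pack("<h", int(round(clipped * 32767)))
    with wave.open(path, "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)  # 2 bytes = 16 bits per sample
        f.setframerate(int(sample_rate))
        f.writeframes(bytes(frames))
    return len(samples)
```

With the objects from the example above, a call could look like write_wav_pcm16("test.wav", audio.samples, audio.sample_rate).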

C++ API

You can use the following code to try vits-piper-lv_LV-aivars-medium with the C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-lv_LV-aivars-medium/lv_LV-aivars-medium.onnx";
  config.model.vits.tokens = "vits-piper-lv_LV-aivars-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-lv_LV-aivars-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Zeme nenes augļus, ja tēvs sēj, bet māte auž.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++ -std=c++17 /tmp/test-piper.cc -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime -o /tmp/test-piper

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to try vits-piper-lv_LV-aivars-medium with the Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-lv_LV-aivars-medium/lv_LV-aivars-medium.onnx".into()),
                tokens: Some("vits-piper-lv_LV-aivars-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-lv_LV-aivars-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Zeme nenes augļus, ja tēvs sēj, bet māte auž.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to try vits-piper-lv_LV-aivars-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-lv_LV-aivars-medium/lv_LV-aivars-medium.onnx',
        tokens: 'vits-piper-lv_LV-aivars-medium/tokens.txt',
        dataDir: 'vits-piper-lv_LV-aivars-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Zeme nenes augļus, ja tēvs sēj, bet māte auž.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to try vits-piper-lv_LV-aivars-medium with the Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-lv_LV-aivars-medium/lv_LV-aivars-medium.onnx',
    tokens: 'vits-piper-lv_LV-aivars-medium/tokens.txt',
    dataDir: 'vits-piper-lv_LV-aivars-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Zeme nenes augļus, ja tēvs sēj, bet māte auž.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to try vits-piper-lv_LV-aivars-medium with the Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-lv_LV-aivars-medium/lv_LV-aivars-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-lv_LV-aivars-medium/tokens.txt",
    dataDir: "vits-piper-lv_LV-aivars-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Zeme nenes augļus, ja tēvs sēj, bet māte auž."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to try vits-piper-lv_LV-aivars-medium with the C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-lv_LV-aivars-medium/lv_LV-aivars-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-lv_LV-aivars-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-lv_LV-aivars-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Zeme nenes augļus, ja tēvs sēj, bet māte auž.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to try vits-piper-lv_LV-aivars-medium with the Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-lv_LV-aivars-medium/lv_LV-aivars-medium.onnx",
        tokens = "vits-piper-lv_LV-aivars-medium/tokens.txt",
        dataDir = "vits-piper-lv_LV-aivars-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Zeme nenes augļus, ja tēvs sēj, bet māte auž.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to try vits-piper-lv_LV-aivars-medium with the Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-lv_LV-aivars-medium/lv_LV-aivars-medium.onnx");
    vits.setTokens("vits-piper-lv_LV-aivars-medium/tokens.txt");
    vits.setDataDir("vits-piper-lv_LV-aivars-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Zeme nenes augļus, ja tēvs sēj, bet māte auž.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-lv_LV-aivars-medium with Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-lv_LV-aivars-medium/lv_LV-aivars-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-lv_LV-aivars-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-lv_LV-aivars-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Zeme nenes augļus, ja tēvs sēj, bet māte auž.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-lv_LV-aivars-medium with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-lv_LV-aivars-medium/lv_LV-aivars-medium.onnx",
				Tokens:  "vits-piper-lv_LV-aivars-medium/tokens.txt",
				DataDir: "vits-piper-lv_LV-aivars-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Zeme nenes augļus, ja tēvs sēj, bet māte auž."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Zeme nenes augļus, ja tēvs sēj, bet māte auž.

sample audios for different speakers are listed below:

Speaker 0

supertonic-3-lv

Info about this model

This model is supertonic 3 from https://huggingface.co/Supertone/supertonic-3

It supports 31 languages: en, ko, ja, ar, bg, cs, da, de, el, es, et, fi, fr, hi, hr, hu, id, it, lt, lv, nl, pl, pt, ro, ru, sk, sl, sv, tr, uk, vi.

This page shows samples for Latvian (lv).

Number of speakers | Sample rate
10 | 24000

Speaker IDs

sid | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
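
The language list and the speaker IDs above can be checked before calling the engine. The following Python sketch assumes the language is passed as a JSON string of the form {"lang": "lv"}, as in the C API example on this page; make_extra and check_sid are hypothetical helper names, not part of sherpa-onnx.

```python
import json

# Language codes copied from the list above (31 in total).
SUPPORTED_LANGS = (
    "en ko ja ar bg cs da de el es et fi fr hi hr hu "
    "id it lt lv nl pl pt ro ru sk sl sv tr uk vi"
).split()

# Speaker IDs run from 0 to 9.
NUM_SPEAKERS = 10


def make_extra(lang: str) -> str:
    """Return the JSON string for the `extra` field (hypothetical helper)."""
    if lang not in SUPPORTED_LANGS:
        raise ValueError(f"Unsupported language: {lang!r}")
    return json.dumps({"lang": lang})


def check_sid(sid: int) -> int:
    """Return sid unchanged if it is a valid speaker ID (hypothetical helper)."""
    if not 0 <= sid < NUM_SPEAKERS:
        raise ValueError(f"sid must be in [0, {NUM_SPEAKERS - 1}], got {sid}")
    return sid


print(make_extra("lv"))
print(check_sid(3))
```

Failing early with a clear message is usually easier to debug than an error raised inside the native library.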

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-supertonic-3-tts-int8-2026-05-11.tar.bz2
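
After downloading the archive, you may want to extract it and confirm that the files referenced by the examples on this page are present. The following Python sketch uses only the standard library. The file names are taken from the examples on this page; extract_archive and missing_files are hypothetical helper names, and the sketch assumes the archive unpacks into a directory named after the archive.

```python
import tarfile
from pathlib import Path
from typing import List

# Directory name used by the examples on this page.
MODEL_DIR = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11"

# Files referenced by the API examples on this page.
REQUIRED_FILES = [
    "duration_predictor.int8.onnx",
    "text_encoder.int8.onnx",
    "vector_estimator.int8.onnx",
    "vocoder.int8.onnx",
    "tts.json",
    "unicode_indexer.bin",
    "voice.bin",
]


def extract_archive(archive: Path, dest: Path) -> None:
    """Extract a .tar.bz2 archive into dest (hypothetical helper)."""
    with tarfile.open(archive, "r:bz2") as tar:
        tar.extractall(dest)


def missing_files(model_dir: Path) -> List[str]:
    """Return the required files that are absent from model_dir."""
    return [name for name in REQUIRED_FILES
            if not (model_dir / name).is_file()]


# Example usage, assuming the archive is in the current directory:
#   extract_archive(Path(MODEL_DIR + ".tar.bz2"), Path("."))
#   print(missing_files(Path(MODEL_DIR)))  # an empty list means all files exist
```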

Android APK

The following table lists the Android TTS Engine APKs with this model for sherpa-onnx v1.13.2.

ABI | URL | China mirror
arm64-v8a | Download | Download
armeabi-v7a | Download | Download
x86_64 | Download | Download
x86 | Download | Download

If you don’t know what ABI is, you probably need to select arm64-v8a.
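
If you can read the machine name of the target device (for example the output of uname -m in a terminal app on the device), the ABI can be guessed from it. The following Python sketch is only an illustration: the mapping reflects common machine names and is an assumption, guess_abi is a hypothetical helper, and unknown names fall back to arm64-v8a as suggested above.

```python
# Common machine names mapped to Android ABI names (assumed mapping).
ABI_BY_MACHINE = {
    "aarch64": "arm64-v8a",
    "arm64": "arm64-v8a",
    "armv7l": "armeabi-v7a",
    "x86_64": "x86_64",
    "amd64": "x86_64",
    "i686": "x86",
    "i386": "x86",
}


def guess_abi(machine: str) -> str:
    """Guess the Android ABI for a machine name (hypothetical helper).

    Falls back to arm64-v8a for unknown names.
    """
    return ABI_BY_MACHINE.get(machine.strip().lower(), "arm64-v8a")


print(guess_abi("aarch64"))
print(guess_abi("i686"))
```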

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded and extracted the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-supertonic-3-tts-int8-2026-05-11.tar.bz2

You can use the following code to play with supertonic-3:

import sherpa_onnx
import soundfile as sf

tts_config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        supertonic=sherpa_onnx.OfflineTtsSupertonicModelConfig(
            duration_predictor="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
            text_encoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
            vector_estimator="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
            vocoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
            tts_json="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
            unicode_indexer="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
            voice_style="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
        ),
        debug=False,
        num_threads=2,
        provider="cpu",
    ),
)

tts = sherpa_onnx.OfflineTts(tts_config)

gen_config = sherpa_onnx.GenerationConfig()
gen_config.sid = 0
gen_config.num_steps = 8
gen_config.speed = 1.0
gen_config.extra["lang"] = "lv"

audio = tts.generate("Šis ir teksta pārvēršanas runā dzinējs, kas izmanto nākamās paaudzes Kaldi", gen_config)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
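
If you prefer not to depend on soundfile, the samples can also be written with the Python standard library. The following sketch assumes that the samples are floats in the range [-1, 1] and stores them as 16-bit mono PCM; write_wave_int16 is a hypothetical helper, not part of sherpa-onnx.

```python
import struct
import wave
from typing import Sequence


def write_wave_int16(filename: str, samples: Sequence[float],
                     sample_rate: int) -> None:
    """Write mono float samples to a 16-bit PCM wave file.

    Assumes samples are in [-1, 1]; values outside that range are clipped.
    """
    pcm = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))
        # Little-endian signed 16-bit integer
        pcm += struct.pack("<h", int(round(s * 32767)))

    with wave.open(filename, "wb") as f:
        f.setnchannels(1)  # mono
        f.setsampwidth(2)  # 2 bytes per sample
        f.setframerate(sample_rate)
        f.writeframes(bytes(pcm))


# Example usage with the objects from the code above:
#   write_wave_int16("test.wav", audio.samples, audio.sample_rate)
```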

C API

You can use the following code to play with supertonic-3 with C API.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));
  config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
  config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
  config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
  config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
  config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
  config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
  config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";

  config.model.num_threads = 2;

  // If you don't want to see debug messages, please set it to 0
  config.model.debug = 1;

  const char *text = "Šis ir teksta pārvēršanas runā dzinējs, kas izmanto nākamās paaudzes Kaldi";

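  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
  if (!tts) {
    // Creation fails, for example, when a model file cannot be found
    fprintf(stderr, "Failed to create the TTS engine. Please check your config\n");
    return -1;
  }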

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.num_steps = 8;
  gen_cfg.speed = 1.0;
  gen_cfg.extra = "{\"lang\": \"lv\"}";

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg,
                                             ProgressCallback, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

To compile the example, first build sherpa-onnx as a shared library with the C API enabled:

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
 -DSHERPA_ONNX_ENABLE_C_API=ON \
 -DCMAKE_BUILD_TYPE=Release \
 -DBUILD_SHARED_LIBS=ON \
 -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
 ..

make
make install

You can find the required header files and library files in /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-supertonic.c. Then you can compile it with the following command:

gcc \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-supertonic \
  /tmp/test-supertonic.c \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-supertonic.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

C++ API

You can use the following code to play with supertonic-3 with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;

  config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
  config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
  config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
  config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
  config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
  config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
  config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";

  config.model.num_threads = 2;

  // If you don't want to see debug messages, please set it to 0
  config.model.debug = 1;

  std::string filename = "./test.wav";
  std::string text = "Šis ir teksta pārvēršanas runā dzinējs, kas izmanto nākamās paaudzes Kaldi";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.num_steps = 8;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed
  gen_cfg.extra = R"({"lang": "lv"})";

  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

To compile the example, first build sherpa-onnx as a shared library:

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
 -DSHERPA_ONNX_ENABLE_C_API=ON \
 -DCMAKE_BUILD_TYPE=Release \
 -DBUILD_SHARED_LIBS=ON \
 -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
 ..

make
make install

You can find the required header files and library files in /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-supertonic.cc. Then you can compile it with the following command:

g++ \
  -std=c++17 \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-supertonic \
  /tmp/test-supertonic.cc \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-cxx-api \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-supertonic.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with supertonic-3 with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsSupertonicModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            supertonic: OfflineTtsSupertonicModelConfig {
                duration_predictor: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx".into()),
                text_encoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx".into()),
                vector_estimator: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx".into()),
                vocoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx".into()),
                tts_json: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json".into()),
                unicode_indexer: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin".into()),
                voice_style: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Šis ir teksta pārvēršanas runā dzinējs, kas izmanto nākamās paaudzes Kaldi";

    let gen_config = GenerationConfig {
        sid: 0,
        num_steps: 8,
        speed: 1.0,
        extra: Some(r#"{"lang": "lv"}"#.into()),
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with supertonic-3 with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      supertonic: {
        durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
        textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
        vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
        vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
        ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
        unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
        voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
      },
      debug: true,
      numThreads: 2,
      provider: 'cpu',
    },
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Šis ir teksta pārvēršanas runā dzinējs, kas izmanto nākamās paaudzes Kaldi';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  numSteps: 8,
  speed: 1.0,
  extra: {lang: 'lv'},
});

const start = Date.now();
const audio = tts.generate({text, generationConfig});
const stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.
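
The real-time factor (RTF) printed by the example above is the processing time divided by the duration of the generated audio, so values below 1 mean that synthesis is faster than playback. The same computation can be sketched in Python as follows; real_time_factor is a hypothetical helper name.

```python
def real_time_factor(elapsed_seconds: float, num_samples: int,
                     sample_rate: int) -> float:
    """Return elapsed time divided by the duration of the generated audio."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # audio duration in seconds
    return elapsed_seconds / duration


# 48000 samples at 24000 Hz is 2 seconds of audio; generating it in
# 0.5 seconds gives an RTF of 0.25.
print(real_time_factor(0.5, 48000, 24000))
```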

Dart API

You can use the following code to play with supertonic-3 with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final supertonic = sherpa_onnx.OfflineTtsSupertonicModelConfig(
    durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
    textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
    vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
    vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
    ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
    unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
    voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    supertonic: supertonic,
    numThreads: 2,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    numSteps: 8,
    speed: 1.0,
    extra: {'lang': 'lv'},
  );
  final audio = tts.generateWithConfig(text: 'Šis ir teksta pārvēršanas runā dzinējs, kas izmanto nākamās paaudzes Kaldi', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with supertonic-3 with Swift API.

func run() {
  let supertonic = sherpaOnnxOfflineTtsSupertonicModelConfig(
    durationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
    textEncoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
    vectorEstimator: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
    vocoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
    ttsJson: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
    unicodeIndexer: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
    voiceStyle: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(supertonic: supertonic)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Šis ir teksta pārvēršanas runā dzinējs, kas izmanto nākamās paaudzes Kaldi"
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.numSteps = 8
  genConfig.speed = 1.0
  genConfig.extra = ["lang": "lv"]

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with supertonic-3 with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Supertonic.DurationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.Model.Supertonic.TextEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.Model.Supertonic.VectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.Model.Supertonic.Vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.Model.Supertonic.TtsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.Model.Supertonic.UnicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.Model.Supertonic.VoiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";

config.Model.NumThreads = 2;
config.Model.Debug = 1;
config.Model.Provider = "cpu";

var tts = new OfflineTts(config);
var text = "Šis ir teksta pārvēršanas runā dzinējs, kas izmanto nākamās paaudzes Kaldi";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.NumSteps = 8;
genConfig.Speed = 1.0f;
genConfig.Extra = "{\"lang\": \"lv\"}";

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with supertonic-3 with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  val config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      supertonic = OfflineTtsSupertonicModelConfig(
        durationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
        textEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
        vectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
        vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
        ttsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
        unicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
        voiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
      ),
      numThreads = 2,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    numSteps = 8,
    speed = 1.0f,
    extra = mapOf("lang" to "lv"),
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Šis ir teksta pārvēršanas runā dzinējs, kas izmanto nākamās paaudzes Kaldi",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with supertonic-3 with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var supertonic = new OfflineTtsSupertonicModelConfig();
    supertonic.setDurationPredictor("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx");
    supertonic.setTextEncoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx");
    supertonic.setVectorEstimator("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx");
    supertonic.setVocoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx");
    supertonic.setTtsJson("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json");
    supertonic.setUnicodeIndexer("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin");
    supertonic.setVoiceStyle("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setSupertonic(supertonic);
    modelConfig.setNumThreads(2);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);

    var tts = new OfflineTts(config);
    var text = "Šis ir teksta pārvēršanas runā dzinējs, kas izmanto nākamās paaudzes Kaldi";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setNumSteps(8);
    genConfig.setSpeed(1.0f);
    genConfig.setExtra("{\"lang\": \"lv\"}");

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with supertonic-3 with Pascal API.

program test_supertonic;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Supertonic.DurationPredictor := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx';
  Config.Model.Supertonic.TextEncoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx';
  Config.Model.Supertonic.VectorEstimator := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx';
  Config.Model.Supertonic.Vocoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx';
  Config.Model.Supertonic.TtsJson := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json';
  Config.Model.Supertonic.UnicodeIndexer := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin';
  Config.Model.Supertonic.VoiceStyle := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin';

  Config.Model.NumThreads := 2;
  Config.Model.Debug := True;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.NumSteps := 8;
  GenConfig.Speed := 1.0;
  GenConfig.Extra := '{"lang": "lv"}';

  Audio := Tts.GenerateWithConfig('Šis ir teksta pārvēršanas runā dzinējs, kas izmanto nākamās paaudzes Kaldi', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with supertonic-3 with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Supertonic: sherpa.OfflineTtsSupertonicModelConfig{
				DurationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
				TextEncoder:       "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
				VectorEstimator:   "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
				Vocoder:           "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
				TtsJson:           "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
				UnicodeIndexer:    "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
				VoiceStyle:        "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
			},
			NumThreads: 2,
			Debug:      true,
		},
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Šis ir teksta pārvēršanas runā dzinējs, kas izmanto nākamās paaudzes Kaldi"

	genConfig := sherpa.GenerationConfig{
		Sid:      0,
		NumSteps: 8,
		Speed:    1.0,
		Extra:    `{"lang": "lv"}`,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

Sample audios for different speakers are listed below:

Speaker 0

0

Sveika pasaule.

1

Kā tev šodien klājas?

2

Debesis ir zilas, un vējš ir maigs.

3

Mašīnmācīšanās palīdz datoriem mācīties no datiem.

4

Runas sintēze pārvērš tekstu skaidrā skaņā.

5

Skolēni bibliotēkā lasīja īsu stāstu.

6

Vilciens kavējās sliežu remonta dēļ.

7

Mazie modeļi ātri darbojas vietējās ierīcēs.

8

Balss asistents palīdz ikdienas uzdevumos.

9

Stabila lasīšana ir svarīga īsiem un gariem teikumiem.

Speaker 1

0

Sveika pasaule.

1

Kā tev šodien klājas?

2

Debesis ir zilas, un vējš ir maigs.

3

Mašīnmācīšanās palīdz datoriem mācīties no datiem.

4

Runas sintēze pārvērš tekstu skaidrā skaņā.

5

Skolēni bibliotēkā lasīja īsu stāstu.

6

Vilciens kavējās sliežu remonta dēļ.

7

Mazie modeļi ātri darbojas vietējās ierīcēs.

8

Balss asistents palīdz ikdienas uzdevumos.

9

Stabila lasīšana ir svarīga īsiem un gariem teikumiem.

Speaker 2

0

Sveika pasaule.

1

Kā tev šodien klājas?

2

Debesis ir zilas, un vējš ir maigs.

3

Mašīnmācīšanās palīdz datoriem mācīties no datiem.

4

Runas sintēze pārvērš tekstu skaidrā skaņā.

5

Skolēni bibliotēkā lasīja īsu stāstu.

6

Vilciens kavējās sliežu remonta dēļ.

7

Mazie modeļi ātri darbojas vietējās ierīcēs.

8

Balss asistents palīdz ikdienas uzdevumos.

9

Stabila lasīšana ir svarīga īsiem un gariem teikumiem.

Speaker 3

0

Sveika pasaule.

1

Kā tev šodien klājas?

2

Debesis ir zilas, un vējš ir maigs.

3

Mašīnmācīšanās palīdz datoriem mācīties no datiem.

4

Runas sintēze pārvērš tekstu skaidrā skaņā.

5

Skolēni bibliotēkā lasīja īsu stāstu.

6

Vilciens kavējās sliežu remonta dēļ.

7

Mazie modeļi ātri darbojas vietējās ierīcēs.

8

Balss asistents palīdz ikdienas uzdevumos.

9

Stabila lasīšana ir svarīga īsiem un gariem teikumiem.

Speaker 4

0

Sveika pasaule.

1

Kā tev šodien klājas?

2

Debesis ir zilas, un vējš ir maigs.

3

Mašīnmācīšanās palīdz datoriem mācīties no datiem.

4

Runas sintēze pārvērš tekstu skaidrā skaņā.

5

Skolēni bibliotēkā lasīja īsu stāstu.

6

Vilciens kavējās sliežu remonta dēļ.

7

Mazie modeļi ātri darbojas vietējās ierīcēs.

8

Balss asistents palīdz ikdienas uzdevumos.

9

Stabila lasīšana ir svarīga īsiem un gariem teikumiem.

Speaker 5

0

Sveika pasaule.

1

Kā tev šodien klājas?

2

Debesis ir zilas, un vējš ir maigs.

3

Mašīnmācīšanās palīdz datoriem mācīties no datiem.

4

Runas sintēze pārvērš tekstu skaidrā skaņā.

5

Skolēni bibliotēkā lasīja īsu stāstu.

6

Vilciens kavējās sliežu remonta dēļ.

7

Mazie modeļi ātri darbojas vietējās ierīcēs.

8

Balss asistents palīdz ikdienas uzdevumos.

9

Stabila lasīšana ir svarīga īsiem un gariem teikumiem.

Speaker 6

0

Sveika pasaule.

1

Kā tev šodien klājas?

2

Debesis ir zilas, un vējš ir maigs.

3

Mašīnmācīšanās palīdz datoriem mācīties no datiem.

4

Runas sintēze pārvērš tekstu skaidrā skaņā.

5

Skolēni bibliotēkā lasīja īsu stāstu.

6

Vilciens kavējās sliežu remonta dēļ.

7

Mazie modeļi ātri darbojas vietējās ierīcēs.

8

Balss asistents palīdz ikdienas uzdevumos.

9

Stabila lasīšana ir svarīga īsiem un gariem teikumiem.

Speaker 7

0

Sveika pasaule.

1

Kā tev šodien klājas?

2

Debesis ir zilas, un vējš ir maigs.

3

Mašīnmācīšanās palīdz datoriem mācīties no datiem.

4

Runas sintēze pārvērš tekstu skaidrā skaņā.

5

Skolēni bibliotēkā lasīja īsu stāstu.

6

Vilciens kavējās sliežu remonta dēļ.

7

Mazie modeļi ātri darbojas vietējās ierīcēs.

8

Balss asistents palīdz ikdienas uzdevumos.

9

Stabila lasīšana ir svarīga īsiem un gariem teikumiem.

Speaker 8

0

Sveika pasaule.

1

Kā tev šodien klājas?

2

Debesis ir zilas, un vējš ir maigs.

3

Mašīnmācīšanās palīdz datoriem mācīties no datiem.

4

Runas sintēze pārvērš tekstu skaidrā skaņā.

5

Skolēni bibliotēkā lasīja īsu stāstu.

6

Vilciens kavējās sliežu remonta dēļ.

7

Mazie modeļi ātri darbojas vietējās ierīcēs.

8

Balss asistents palīdz ikdienas uzdevumos.

9

Stabila lasīšana ir svarīga īsiem un gariem teikumiem.

Speaker 9

0

Sveika pasaule.

1

Kā tev šodien klājas?

2

Debesis ir zilas, un vējš ir maigs.

3

Mašīnmācīšanās palīdz datoriem mācīties no datiem.

4

Runas sintēze pārvērš tekstu skaidrā skaņā.

5

Skolēni bibliotēkā lasīja īsu stāstu.

6

Vilciens kavējās sliežu remonta dēļ.

7

Mazie modeļi ātri darbojas vietējās ierīcēs.

8

Balss asistents palīdz ikdienas uzdevumos.

9

Stabila lasīšana ir svarīga īsiem un gariem teikumiem.

Lithuanian

This section lists text to speech models for Lithuanian.

supertonic-3-lt

Info about this model

This model is supertonic 3 from https://huggingface.co/Supertone/supertonic-3

It supports 31 languages: en, ko, ja, ar, bg, cs, da, de, el, es, et, fi, fr, hi, hr, hu, id, it, lt, lv, nl, pl, pt, ro, ru, sk, sl, sv, tr, uk, vi.

This page shows samples for Lithuanian (lt).
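
The language is chosen per request through a `lang` entry in the generation config's `extra` field, which several of the examples on this page pass as a JSON string. Below is a minimal sketch of a helper that checks a code against the list above and builds that string. `SUPPORTED_LANGS` and `make_extra_json` are hypothetical names used for illustration only; they are not part of sherpa-onnx.

```python
import json

# Language codes copied from the list above (31 entries).
SUPPORTED_LANGS = (
    "en ko ja ar bg cs da de el es et fi fr hi hr hu id it "
    "lt lv nl pl pt ro ru sk sl sv tr uk vi"
).split()


def make_extra_json(lang: str) -> str:
    """Return the JSON string for the `extra` field (hypothetical helper)."""
    if lang not in SUPPORTED_LANGS:
        raise ValueError(f"unsupported language code: {lang!r}")
    return json.dumps({"lang": lang})


print(make_extra_json("lt"))  # prints {"lang": "lt"}
```

Rejecting unknown codes early gives a clearer error than passing a typo through to the model.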

Number of speakers | Sample rate
10 | 24000

Speaker IDs

sid | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-supertonic-3-tts-int8-2026-05-11.tar.bz2
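
If you prefer to fetch the archive from Python rather than with `wget` and `tar`, something like the following sketch may work. It uses only the standard library. The function names are hypothetical, and it assumes the archive unpacks into a directory named after the archive, which is what the paths in the examples on this page suggest.

```python
import tarfile
import urllib.request
from pathlib import Path

URL = (
    "https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/"
    "sherpa-onnx-supertonic-3-tts-int8-2026-05-11.tar.bz2"
)
SUFFIX = ".tar.bz2"


def archive_dir_name(url: str) -> str:
    """Derive the expected model directory name from the archive URL."""
    name = url.rsplit("/", 1)[-1]
    return name[: -len(SUFFIX)] if name.endswith(SUFFIX) else name


def download_and_extract(url: str, dest: Path = Path(".")) -> Path:
    """Download the archive into dest, unpack it, and delete the archive."""
    archive = dest / url.rsplit("/", 1)[-1]
    urllib.request.urlretrieve(url, archive)
    with tarfile.open(archive, "r:bz2") as tar:
        tar.extractall(dest)
    archive.unlink()
    # Assumption: the top-level directory matches the archive name.
    return dest / archive_dir_name(url)


# Usage (performs a network download):
#   model_dir = download_and_extract(URL)
```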

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI | URL | China mirror
arm64-v8a | Download | Download
armeabi-v7a | Download | Download
x86_64 | Download | Download
x86 | Download | Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-supertonic-3-tts-int8-2026-05-11.tar.bz2

You can use the following code to play with supertonic-3

import sherpa_onnx
import soundfile as sf

tts_config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        supertonic=sherpa_onnx.OfflineTtsSupertonicModelConfig(
            duration_predictor="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
            text_encoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
            vector_estimator="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
            vocoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
            tts_json="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
            unicode_indexer="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
            voice_style="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
        ),
        debug=False,
        num_threads=2,
        provider="cpu",
    ),
)

tts = sherpa_onnx.OfflineTts(tts_config)

gen_config = sherpa_onnx.GenerationConfig()
gen_config.sid = 0
gen_config.num_steps = 8
gen_config.speed = 1.0
gen_config.extra["lang"] = "lt"

audio = tts.generate("Tai teksto į kalbą variklis, kuriame naudojamas naujos kartos Kaldi", gen_config)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
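
To gauge synthesis speed, one common measure is the real-time factor (RTF): elapsed wall-clock time divided by the duration of the generated audio. The sketch below shows the arithmetic as a standalone function; `real_time_factor` is a hypothetical helper, not a sherpa-onnx API.

```python
def real_time_factor(elapsed_seconds: float, num_samples: int, sample_rate: int) -> float:
    """Return elapsed time divided by audio duration (lower is faster)."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # audio length in seconds
    return elapsed_seconds / duration


# 48000 samples at 24000 Hz is 2 s of audio; generating it in 0.5 s gives 0.25.
print(real_time_factor(0.5, 48000, 24000))  # prints 0.25
```

With the example above, you could time the `tts.generate(...)` call using `time.perf_counter()` and pass `len(audio.samples)` and `audio.sample_rate` to this function. An RTF below 1 means synthesis runs faster than playback.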

C API

You can use the following code to play with supertonic-3 with C API.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));
  config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
  config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
  config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
  config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
  config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
  config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
  config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";

  config.model.num_threads = 2;

  // If you don't want to see debug messages, please set it to 0
  config.model.debug = 1;

  const char *text = "Tai teksto į kalbą variklis, kuriame naudojamas naujos kartos Kaldi";

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.num_steps = 8;
  gen_cfg.speed = 1.0;
  gen_cfg.extra = "{\"lang\": \"lt\"}";

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg,
                                             ProgressCallback, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
 -DSHERPA_ONNX_ENABLE_C_API=ON \
 -DCMAKE_BUILD_TYPE=Release \
 -DBUILD_SHARED_LIBS=ON \
 -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
 ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-supertonic.c. Then you can compile it with the following command:

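# The source file is listed before the -l flags so that linkers
# that resolve symbols in order (e.g., with --as-needed) can find them.
gcc \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-supertonic \
  /tmp/test-supertonic.c \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-c-api \
  -lonnxruntime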

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-supertonic.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

C++ API

You can use the following code to play with supertonic-3 with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;

  config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
  config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
  config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
  config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
  config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
  config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
  config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";

  config.model.num_threads = 2;

  // If you don't want to see debug messages, please set it to 0
  config.model.debug = 1;

  std::string filename = "./test.wav";
  std::string text = "Tai teksto į kalbą variklis, kuriame naudojamas naujos kartos Kaldi";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.num_steps = 8;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed
  gen_cfg.extra = R"({"lang": "lt"})";

  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
 -DSHERPA_ONNX_ENABLE_C_API=ON \
 -DCMAKE_BUILD_TYPE=Release \
 -DBUILD_SHARED_LIBS=ON \
 -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
 ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-supertonic.cc. Then you can compile it with the following command:

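# The source file is listed before the -l flags so that linkers
# that resolve symbols in order (e.g., with --as-needed) can find them.
g++ \
  -std=c++17 \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-supertonic \
  /tmp/test-supertonic.cc \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-cxx-api \
  -lsherpa-onnx-c-api \
  -lonnxruntime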

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-supertonic.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with supertonic-3 with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsSupertonicModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            supertonic: OfflineTtsSupertonicModelConfig {
                duration_predictor: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx".into()),
                text_encoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx".into()),
                vector_estimator: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx".into()),
                vocoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx".into()),
                tts_json: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json".into()),
                unicode_indexer: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin".into()),
                voice_style: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Tai teksto į kalbą variklis, kuriame naudojamas naujos kartos Kaldi";

    let gen_config = GenerationConfig {
        sid: 0,
        num_steps: 8,
        speed: 1.0,
        extra: Some(r#"{"lang": "lt"}"#.into()),
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with supertonic-3 with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      supertonic: {
        durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
        textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
        vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
        vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
        ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
        unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
        voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
      },
      debug: true,
      numThreads: 2,
      provider: 'cpu',
    },
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Tai teksto į kalbą variklis, kuriame naudojamas naujos kartos Kaldi';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  numSteps: 8,
  speed: 1.0,
  extra: {lang: 'lt'},
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to play with supertonic-3 with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final supertonic = sherpa_onnx.OfflineTtsSupertonicModelConfig(
    durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
    textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
    vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
    vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
    ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
    unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
    voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    supertonic: supertonic,
    numThreads: 2,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    numSteps: 8,
    speed: 1.0,
    extra: {'lang': 'lt'},
  );
  final audio = tts.generateWithConfig(text: 'Tai teksto į kalbą variklis, kuriame naudojamas naujos kartos Kaldi', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with supertonic-3 with Swift API.

func run() {
  let supertonic = sherpaOnnxOfflineTtsSupertonicModelConfig(
    durationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
    textEncoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
    vectorEstimator: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
    vocoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
    ttsJson: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
    unicodeIndexer: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
    voiceStyle: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(supertonic: supertonic)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Tai teksto į kalbą variklis, kuriame naudojamas naujos kartos Kaldi"
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.numSteps = 8
  genConfig.speed = 1.0
  genConfig.extra = ["lang": "lt"]

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with supertonic-3 with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Supertonic.DurationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.Model.Supertonic.TextEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.Model.Supertonic.VectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.Model.Supertonic.Vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.Model.Supertonic.TtsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.Model.Supertonic.UnicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.Model.Supertonic.VoiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";

config.Model.NumThreads = 2;
config.Model.Debug = 1;
config.Model.Provider = "cpu";

var tts = new OfflineTts(config);
var text = "Tai teksto į kalbą variklis, kuriame naudojamas naujos kartos Kaldi";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.NumSteps = 8;
genConfig.Speed = 1.0f;
genConfig.Extra = "{\"lang\": \"lt\"}";

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with supertonic-3 with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      supertonic = OfflineTtsSupertonicModelConfig(
        durationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
        textEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
        vectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
        vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
        ttsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
        unicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
        voiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
      ),
      numThreads = 2,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    numSteps = 8,
    speed = 1.0f,
    extra = mapOf("lang" to "lt"),
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Tai teksto į kalbą variklis, kuriame naudojamas naujos kartos Kaldi",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with supertonic-3 with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var supertonic = new OfflineTtsSupertonicModelConfig();
    supertonic.setDurationPredictor("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx");
    supertonic.setTextEncoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx");
    supertonic.setVectorEstimator("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx");
    supertonic.setVocoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx");
    supertonic.setTtsJson("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json");
    supertonic.setUnicodeIndexer("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin");
    supertonic.setVoiceStyle("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setSupertonic(supertonic);
    modelConfig.setNumThreads(2);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);

    var tts = new OfflineTts(config);
    var text = "Tai teksto į kalbą variklis, kuriame naudojamas naujos kartos Kaldi";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setNumSteps(8);
    genConfig.setSpeed(1.0f);
    genConfig.setExtra("{\"lang\": \"lt\"}");

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with supertonic-3 with Pascal API.

program test_supertonic;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Supertonic.DurationPredictor := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx';
  Config.Model.Supertonic.TextEncoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx';
  Config.Model.Supertonic.VectorEstimator := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx';
  Config.Model.Supertonic.Vocoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx';
  Config.Model.Supertonic.TtsJson := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json';
  Config.Model.Supertonic.UnicodeIndexer := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin';
  Config.Model.Supertonic.VoiceStyle := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin';

  Config.Model.NumThreads := 2;
  Config.Model.Debug := True;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.NumSteps := 8;
  GenConfig.Speed := 1.0;
  GenConfig.Extra := '{"lang": "lt"}';

  Audio := Tts.GenerateWithConfig('Tai teksto į kalbą variklis, kuriame naudojamas naujos kartos Kaldi', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with supertonic-3 with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Supertonic: sherpa.OfflineTtsSupertonicModelConfig{
				DurationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
				TextEncoder:       "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
				VectorEstimator:   "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
				Vocoder:           "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
				TtsJson:           "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
				UnicodeIndexer:    "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
				VoiceStyle:        "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
			},
			NumThreads: 2,
			Debug:      true,
		},
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Tai teksto į kalbą variklis, kuriame naudojamas naujos kartos Kaldi"

	genConfig := sherpa.GenerationConfig{
		Sid:       0,
		NumSteps:  8,
		Speed:     1.0,
		Extra:     `{"lang": "lt"}`,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

Sample audio for each speaker is listed below:

Speaker 0

0

Labas pasauli.

1

Kaip šiandien laikaisi?

2

Dangus mėlynas, o vėjas švelnus.

3

Mašininis mokymasis padeda kompiuteriams mokytis iš duomenų.

4

Kalbos sintezė paverčia tekstą aiškiu garsu.

5

Mokiniai bibliotekoje perskaitė trumpą istoriją.

6

Traukinys vėlavo dėl bėgių priežiūros.

7

Maži modeliai greitai veikia vietiniuose įrenginiuose.

8

Balso asistentas padeda atlikti kasdienes užduotis.

9

Stabilus skaitymas svarbus trumpiems ir ilgiems sakiniams.

Speaker 1

0

Labas pasauli.

1

Kaip šiandien laikaisi?

2

Dangus mėlynas, o vėjas švelnus.

3

Mašininis mokymasis padeda kompiuteriams mokytis iš duomenų.

4

Kalbos sintezė paverčia tekstą aiškiu garsu.

5

Mokiniai bibliotekoje perskaitė trumpą istoriją.

6

Traukinys vėlavo dėl bėgių priežiūros.

7

Maži modeliai greitai veikia vietiniuose įrenginiuose.

8

Balso asistentas padeda atlikti kasdienes užduotis.

9

Stabilus skaitymas svarbus trumpiems ir ilgiems sakiniams.

Speaker 2

0

Labas pasauli.

1

Kaip šiandien laikaisi?

2

Dangus mėlynas, o vėjas švelnus.

3

Mašininis mokymasis padeda kompiuteriams mokytis iš duomenų.

4

Kalbos sintezė paverčia tekstą aiškiu garsu.

5

Mokiniai bibliotekoje perskaitė trumpą istoriją.

6

Traukinys vėlavo dėl bėgių priežiūros.

7

Maži modeliai greitai veikia vietiniuose įrenginiuose.

8

Balso asistentas padeda atlikti kasdienes užduotis.

9

Stabilus skaitymas svarbus trumpiems ir ilgiems sakiniams.

Speaker 3

0

Labas pasauli.

1

Kaip šiandien laikaisi?

2

Dangus mėlynas, o vėjas švelnus.

3

Mašininis mokymasis padeda kompiuteriams mokytis iš duomenų.

4

Kalbos sintezė paverčia tekstą aiškiu garsu.

5

Mokiniai bibliotekoje perskaitė trumpą istoriją.

6

Traukinys vėlavo dėl bėgių priežiūros.

7

Maži modeliai greitai veikia vietiniuose įrenginiuose.

8

Balso asistentas padeda atlikti kasdienes užduotis.

9

Stabilus skaitymas svarbus trumpiems ir ilgiems sakiniams.

Speaker 4

0

Labas pasauli.

1

Kaip šiandien laikaisi?

2

Dangus mėlynas, o vėjas švelnus.

3

Mašininis mokymasis padeda kompiuteriams mokytis iš duomenų.

4

Kalbos sintezė paverčia tekstą aiškiu garsu.

5

Mokiniai bibliotekoje perskaitė trumpą istoriją.

6

Traukinys vėlavo dėl bėgių priežiūros.

7

Maži modeliai greitai veikia vietiniuose įrenginiuose.

8

Balso asistentas padeda atlikti kasdienes užduotis.

9

Stabilus skaitymas svarbus trumpiems ir ilgiems sakiniams.

Speaker 5

0

Labas pasauli.

1

Kaip šiandien laikaisi?

2

Dangus mėlynas, o vėjas švelnus.

3

Mašininis mokymasis padeda kompiuteriams mokytis iš duomenų.

4

Kalbos sintezė paverčia tekstą aiškiu garsu.

5

Mokiniai bibliotekoje perskaitė trumpą istoriją.

6

Traukinys vėlavo dėl bėgių priežiūros.

7

Maži modeliai greitai veikia vietiniuose įrenginiuose.

8

Balso asistentas padeda atlikti kasdienes užduotis.

9

Stabilus skaitymas svarbus trumpiems ir ilgiems sakiniams.

Speaker 6

0

Labas pasauli.

1

Kaip šiandien laikaisi?

2

Dangus mėlynas, o vėjas švelnus.

3

Mašininis mokymasis padeda kompiuteriams mokytis iš duomenų.

4

Kalbos sintezė paverčia tekstą aiškiu garsu.

5

Mokiniai bibliotekoje perskaitė trumpą istoriją.

6

Traukinys vėlavo dėl bėgių priežiūros.

7

Maži modeliai greitai veikia vietiniuose įrenginiuose.

8

Balso asistentas padeda atlikti kasdienes užduotis.

9

Stabilus skaitymas svarbus trumpiems ir ilgiems sakiniams.

Speaker 7

0

Labas pasauli.

1

Kaip šiandien laikaisi?

2

Dangus mėlynas, o vėjas švelnus.

3

Mašininis mokymasis padeda kompiuteriams mokytis iš duomenų.

4

Kalbos sintezė paverčia tekstą aiškiu garsu.

5

Mokiniai bibliotekoje perskaitė trumpą istoriją.

6

Traukinys vėlavo dėl bėgių priežiūros.

7

Maži modeliai greitai veikia vietiniuose įrenginiuose.

8

Balso asistentas padeda atlikti kasdienes užduotis.

9

Stabilus skaitymas svarbus trumpiems ir ilgiems sakiniams.

Speaker 8

0

Labas pasauli.

1

Kaip šiandien laikaisi?

2

Dangus mėlynas, o vėjas švelnus.

3

Mašininis mokymasis padeda kompiuteriams mokytis iš duomenų.

4

Kalbos sintezė paverčia tekstą aiškiu garsu.

5

Mokiniai bibliotekoje perskaitė trumpą istoriją.

6

Traukinys vėlavo dėl bėgių priežiūros.

7

Maži modeliai greitai veikia vietiniuose įrenginiuose.

8

Balso asistentas padeda atlikti kasdienes užduotis.

9

Stabilus skaitymas svarbus trumpiems ir ilgiems sakiniams.

Speaker 9

0

Labas pasauli.

1

Kaip šiandien laikaisi?

2

Dangus mėlynas, o vėjas švelnus.

3

Mašininis mokymasis padeda kompiuteriams mokytis iš duomenų.

4

Kalbos sintezė paverčia tekstą aiškiu garsu.

5

Mokiniai bibliotekoje perskaitė trumpą istoriją.

6

Traukinys vėlavo dėl bėgių priežiūros.

7

Maži modeliai greitai veikia vietiniuose įrenginiuose.

8

Balso asistentas padeda atlikti kasdienes užduotis.

9

Stabilus skaitymas svarbus trumpiems ir ilgiems sakiniams.
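
A grid like the one above (every speaker ID paired with every sentence) could be planned with a simple nested loop. The sketch below only builds the list of jobs; the file-name pattern and the `sample_plan` helper are hypothetical, and only the first two sentences are included for brevity.

```python
from itertools import product

# First two Lithuanian sentences from the samples above.
SENTENCES = ["Labas pasauli.", "Kaip šiandien laikaisi?"]


def sample_plan(num_speakers, sentences):
    """Return (sid, sentence_index, text, filename) for every combination."""
    return [
        (sid, idx, sentences[idx], f"speaker-{sid}-sentence-{idx}.wav")
        for sid, idx in product(range(num_speakers), range(len(sentences)))
    ]


plan = sample_plan(10, SENTENCES)
print(len(plan))  # prints 20
print(plan[0])    # prints (0, 0, 'Labas pasauli.', 'speaker-0-sentence-0.wav')
```

Each entry could then be passed to the Python API example on this page, setting `gen_config.sid` to the speaker ID and writing the result to the given file name.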

Luxembourgish

This section lists text to speech models for Luxembourgish.

vits-piper-lb_LU-marylux-medium

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/lb/lb_LU/marylux/medium

Number of speakers | Sample rate
1 | 22050

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-lb_LU-marylux-medium.tar.bz2

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI | URL | China mirror
arm64-v8a | Download | Download
armeabi-v7a | Download | Download
x86_64 | Download | Download
x86 | Download | Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

The following code shows how to use vits-piper-lb_LU-marylux-medium with the C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-lb_LU-marylux-medium/lb_LU-marylux-medium.onnx";
  config.model.vits.tokens = "vits-piper-lb_LU-marylux-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-lb_LU-marylux-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Op der Haaptstrooss sinn all Stroossen Brécken, awer d'Dier kann iwwerall erreecht ginn.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
  -DSHERPA_ONNX_ENABLE_C_API=ON \
  -DCMAKE_BUILD_TYPE=Release \
  -DBUILD_SHARED_LIBS=ON \
  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
  ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc \
  -I /tmp/sherpa-onnx/shared/include \
  -L /tmp/sherpa-onnx/shared/lib \
  -o /tmp/test-piper \
  /tmp/test-piper.c \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-lb_LU-marylux-medium.tar.bz2

You can use the following code to play with vits-piper-lb_LU-marylux-medium

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-lb_LU-marylux-medium/lb_LU-marylux-medium.onnx",
            data_dir="vits-piper-lb_LU-marylux-medium/espeak-ng-data",
            tokens="vits-piper-lb_LU-marylux-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Op der Haaptstrooss sinn all Stroossen Brécken, awer d'Dier kann iwwerall erreecht ginn.",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)

C++ API

The following code shows how to use vits-piper-lb_LU-marylux-medium with the C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-lb_LU-marylux-medium/lb_LU-marylux-medium.onnx";
  config.model.vits.tokens = "vits-piper-lb_LU-marylux-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-lb_LU-marylux-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Op der Haaptstrooss sinn all Stroossen Brécken, awer d'Dier kann iwwerall erreecht ginn.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
  -DSHERPA_ONNX_ENABLE_C_API=ON \
  -DCMAKE_BUILD_TYPE=Release \
  -DBUILD_SHARED_LIBS=ON \
  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
  ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++ \
  -std=c++17 \
  -I /tmp/sherpa-onnx/shared/include \
  -L /tmp/sherpa-onnx/shared/lib \
  -o /tmp/test-piper \
  /tmp/test-piper.cc \
  -lsherpa-onnx-cxx-api \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

The following code shows how to use vits-piper-lb_LU-marylux-medium with the Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-lb_LU-marylux-medium/lb_LU-marylux-medium.onnx".into()),
                tokens: Some("vits-piper-lb_LU-marylux-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-lb_LU-marylux-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Op der Haaptstrooss sinn all Stroossen Brécken, awer d'Dier kann iwwerall erreecht ginn.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-lb_LU-marylux-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-lb_LU-marylux-medium/lb_LU-marylux-medium.onnx',
        tokens: 'vits-piper-lb_LU-marylux-medium/tokens.txt',
        dataDir: 'vits-piper-lb_LU-marylux-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = "Op der Haaptstrooss sinn all Stroossen Brécken, awer d'Dier kann iwwerall erreecht ginn.";

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

The following code shows how to use vits-piper-lb_LU-marylux-medium with the Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-lb_LU-marylux-medium/lb_LU-marylux-medium.onnx',
    tokens: 'vits-piper-lb_LU-marylux-medium/tokens.txt',
    dataDir: 'vits-piper-lb_LU-marylux-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: "Op der Haaptstrooss sinn all Stroossen Brécken, awer d'Dier kann iwwerall erreecht ginn.", config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

The following code shows how to use vits-piper-lb_LU-marylux-medium with the Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-lb_LU-marylux-medium/lb_LU-marylux-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-lb_LU-marylux-medium/tokens.txt",
    dataDir: "vits-piper-lb_LU-marylux-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Op der Haaptstrooss sinn all Stroossen Brécken, awer d'Dier kann iwwerall erreecht ginn."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

The following code shows how to use vits-piper-lb_LU-marylux-medium with the C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-lb_LU-marylux-medium/lb_LU-marylux-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-lb_LU-marylux-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-lb_LU-marylux-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Op der Haaptstrooss sinn all Stroossen Brécken, awer d'Dier kann iwwerall erreecht ginn.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

The following code shows how to use vits-piper-lb_LU-marylux-medium with the Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  val config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-lb_LU-marylux-medium/lb_LU-marylux-medium.onnx",
        tokens = "vits-piper-lb_LU-marylux-medium/tokens.txt",
        dataDir = "vits-piper-lb_LU-marylux-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Op der Haaptstrooss sinn all Stroossen Brécken, awer d'Dier kann iwwerall erreecht ginn.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

The following code shows how to use vits-piper-lb_LU-marylux-medium with the Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-lb_LU-marylux-medium/lb_LU-marylux-medium.onnx");
    vits.setTokens("vits-piper-lb_LU-marylux-medium/tokens.txt");
    vits.setDataDir("vits-piper-lb_LU-marylux-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Op der Haaptstrooss sinn all Stroossen Brécken, awer d'Dier kann iwwerall erreecht ginn.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

The following code shows how to use vits-piper-lb_LU-marylux-medium with the Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-lb_LU-marylux-medium/lb_LU-marylux-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-lb_LU-marylux-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-lb_LU-marylux-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Op der Haaptstrooss sinn all Stroossen Brécken, awer d''Dier kann iwwerall erreecht ginn.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

The following code shows how to use vits-piper-lb_LU-marylux-medium with the Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-lb_LU-marylux-medium/lb_LU-marylux-medium.onnx",
				Tokens:  "vits-piper-lb_LU-marylux-medium/tokens.txt",
				DataDir: "vits-piper-lb_LU-marylux-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Op der Haaptstrooss sinn all Stroossen Brécken, awer d'Dier kann iwwerall erreecht ginn."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Op der Haaptstrooss sinn all Stroossen Brécken, awer d'Dier kann iwwerall erreecht ginn.

sample audio for each speaker is listed below:

Speaker 0

Malayalam

This section lists text to speech models for Malayalam.

vits-piper-ml_IN-arjun-medium

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/ml/ml_IN/arjun/medium

Number of speakers    Sample rate
1                     22050

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-ml_IN-arjun-medium.tar.bz2
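
The archive can be fetched and unpacked from the command line, in the same way as the wget/tar steps shown for other models. The following is a minimal sketch, assuming wget and tar are installed and that the archive unpacks into a directory named after the model, as the paths in the examples for this model suggest. The function names fetch_model and check_model_dir are hypothetical helpers introduced here for illustration; they are not part of sherpa-onnx.

```shell
# Hypothetical helpers for illustration; not part of sherpa-onnx.
model=vits-piper-ml_IN-arjun-medium
url="https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/${model}.tar.bz2"

# Download the archive, unpack it into ./${model}, then delete the archive.
fetch_model() {
  wget "$url" &&
  tar xvf "${model}.tar.bz2" &&
  rm "${model}.tar.bz2"
}

# Check that a directory holds the entries referenced by the examples.
# Assumption: the archive unpacks to ${model}/ with these names.
check_model_dir() {
  dir=$1
  for f in ml_IN-arjun-medium.onnx tokens.txt; do
    [ -f "$dir/$f" ] || { echo "missing file: $dir/$f" >&2; return 1; }
  done
  [ -d "$dir/espeak-ng-data" ] || { echo "missing directory: $dir/espeak-ng-data" >&2; return 1; }
}
```

To use it, run fetch_model && check_model_dir "$model" from the directory in which you plan to run the examples.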

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI            URL         China mirror
arm64-v8a      Download    Download
armeabi-v7a    Download    Download
x86_64         Download    Download
x86            Download    Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

The following code shows how to use vits-piper-ml_IN-arjun-medium with the C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-ml_IN-arjun-medium/ml_IN-arjun-medium.onnx";
  config.model.vits.tokens = "vits-piper-ml_IN-arjun-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-ml_IN-arjun-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "മണ്ണ് മരിക്കുമ്പോൾ കാട്ടിലെ വെള്ളവും മരിക്കുന്നു.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
  -DSHERPA_ONNX_ENABLE_C_API=ON \
  -DCMAKE_BUILD_TYPE=Release \
  -DBUILD_SHARED_LIBS=ON \
  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
  ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc \
  -I /tmp/sherpa-onnx/shared/include \
  -L /tmp/sherpa-onnx/shared/lib \
  -o /tmp/test-piper \
  /tmp/test-piper.c \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-ml_IN-arjun-medium.tar.bz2

You can use the following code to play with vits-piper-ml_IN-arjun-medium

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-ml_IN-arjun-medium/ml_IN-arjun-medium.onnx",
            data_dir="vits-piper-ml_IN-arjun-medium/espeak-ng-data",
            tokens="vits-piper-ml_IN-arjun-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="മണ്ണ് മരിക്കുമ്പോൾ കാട്ടിലെ വെള്ളവും മരിക്കുന്നു.",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)

C++ API

The following code shows how to use vits-piper-ml_IN-arjun-medium with the C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-ml_IN-arjun-medium/ml_IN-arjun-medium.onnx";
  config.model.vits.tokens = "vits-piper-ml_IN-arjun-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-ml_IN-arjun-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "മണ്ണ് മരിക്കുമ്പോൾ കാട്ടിലെ വെള്ളവും മരിക്കുന്നു.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
  -DSHERPA_ONNX_ENABLE_C_API=ON \
  -DCMAKE_BUILD_TYPE=Release \
  -DBUILD_SHARED_LIBS=ON \
  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
  ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++ \
  -std=c++17 \
  -I /tmp/sherpa-onnx/shared/include \
  -L /tmp/sherpa-onnx/shared/lib \
  -o /tmp/test-piper \
  /tmp/test-piper.cc \
  -lsherpa-onnx-cxx-api \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

The following code shows how to use vits-piper-ml_IN-arjun-medium with the Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-ml_IN-arjun-medium/ml_IN-arjun-medium.onnx".into()),
                tokens: Some("vits-piper-ml_IN-arjun-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-ml_IN-arjun-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "മണ്ണ് മരിക്കുമ്പോൾ കാട്ടിലെ വെള്ളവും മരിക്കുന്നു.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-ml_IN-arjun-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-ml_IN-arjun-medium/ml_IN-arjun-medium.onnx',
        tokens: 'vits-piper-ml_IN-arjun-medium/tokens.txt',
        dataDir: 'vits-piper-ml_IN-arjun-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'മണ്ണ് മരിക്കുമ്പോൾ കാട്ടിലെ വെള്ളവും മരിക്കുന്നു.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

The following code shows how to use vits-piper-ml_IN-arjun-medium with the Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-ml_IN-arjun-medium/ml_IN-arjun-medium.onnx',
    tokens: 'vits-piper-ml_IN-arjun-medium/tokens.txt',
    dataDir: 'vits-piper-ml_IN-arjun-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'മണ്ണ് മരിക്കുമ്പോൾ കാട്ടിലെ വെള്ളവും മരിക്കുന്നു.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

The following code shows how to use vits-piper-ml_IN-arjun-medium with the Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-ml_IN-arjun-medium/ml_IN-arjun-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-ml_IN-arjun-medium/tokens.txt",
    dataDir: "vits-piper-ml_IN-arjun-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "മണ്ണ് മരിക്കുമ്പോൾ കാട്ടിലെ വെള്ളവും മരിക്കുന്നു."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

The following code shows how to use vits-piper-ml_IN-arjun-medium with the C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-ml_IN-arjun-medium/ml_IN-arjun-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-ml_IN-arjun-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-ml_IN-arjun-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "മണ്ണ് മരിക്കുമ്പോൾ കാട്ടിലെ വെള്ളവും മരിക്കുന്നു.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-ml_IN-arjun-medium with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-ml_IN-arjun-medium/ml_IN-arjun-medium.onnx",
        tokens = "vits-piper-ml_IN-arjun-medium/tokens.txt",
        dataDir = "vits-piper-ml_IN-arjun-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "മണ്ണ് മരിക്കുമ്പോൾ കാട്ടിലെ വെള്ളവും മരിക്കുന്നു.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with vits-piper-ml_IN-arjun-medium with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-ml_IN-arjun-medium/ml_IN-arjun-medium.onnx");
    vits.setTokens("vits-piper-ml_IN-arjun-medium/tokens.txt");
    vits.setDataDir("vits-piper-ml_IN-arjun-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "മണ്ണ് മരിക്കുമ്പോൾ കാട്ടിലെ വെള്ളവും മരിക്കുന്നു.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-ml_IN-arjun-medium with Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-ml_IN-arjun-medium/ml_IN-arjun-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-ml_IN-arjun-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-ml_IN-arjun-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('മണ്ണ് മരിക്കുമ്പോൾ കാട്ടിലെ വെള്ളവും മരിക്കുന്നു.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-ml_IN-arjun-medium with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-ml_IN-arjun-medium/ml_IN-arjun-medium.onnx",
				Tokens:  "vits-piper-ml_IN-arjun-medium/tokens.txt",
				DataDir: "vits-piper-ml_IN-arjun-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "മണ്ണ് മരിക്കുമ്പോൾ കാട്ടിലെ വെള്ളവും മരിക്കുന്നു."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

മണ്ണ് മരിക്കുമ്പോൾ കാട്ടിലെ വെള്ളവും മരിക്കുന്നു.

sample audio for each speaker is listed below:

Speaker 0

vits-piper-ml_IN-meera-medium

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/ml/ml_IN/meera/medium

Number of speakers | Sample rate
1 | 22050

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-ml_IN-meera-medium.tar.bz2
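
The download and extraction steps can also be scripted. The following is a minimal sketch using only the Python standard library; the helper names (model_url, extract_archive, download_and_extract) are hypothetical and not part of sherpa-onnx, and the URL prefix is copied from the address above.

```python
import tarfile
import urllib.request
from pathlib import Path

# Release URL prefix copied from the download address in this section.
RELEASE_PREFIX = "https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models"


def model_url(model_name: str) -> str:
    """Build the archive URL for a model name (hypothetical helper)."""
    return f"{RELEASE_PREFIX}/{model_name}.tar.bz2"


def extract_archive(archive: Path, dest: Path) -> None:
    """Unpack a .tar.bz2 archive into dest."""
    with tarfile.open(archive, "r:bz2") as tar:
        tar.extractall(dest)


def download_and_extract(model_name: str, dest: Path = Path(".")) -> Path:
    """Download the archive (needs network access), unpack it, delete the archive."""
    archive = dest / f"{model_name}.tar.bz2"
    urllib.request.urlretrieve(model_url(model_name), archive)
    extract_archive(archive, dest)
    archive.unlink()  # only the extracted directory is kept
    return dest / model_name


print(model_url("vits-piper-ml_IN-meera-medium"))
```

Calling download_and_extract("vits-piper-ml_IN-meera-medium") should leave a directory of that name in the current directory, which is the layout the API examples below assume.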

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI | URL | China mirror
arm64-v8a | Download | Download
armeabi-v7a | Download | Download
x86_64 | Download | Download
x86 | Download | Download

If you don’t know what ABI is, you probably need to select arm64-v8a.
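
If you have shell access to the device, `adb shell getprop ro.product.cpu.abi` reports the ABI directly. Otherwise, the sketch below maps common CPU architecture names (as printed by `uname -m`) to the ABI names in the table. The function name abi_for_arch is hypothetical, and the fallback to arm64-v8a simply follows the recommendation above.

```python
# Common `uname -m` style architecture names and the matching Android ABI.
ARCH_TO_ABI = {
    "aarch64": "arm64-v8a",
    "arm64": "arm64-v8a",
    "armv7l": "armeabi-v7a",
    "armv7": "armeabi-v7a",
    "x86_64": "x86_64",
    "amd64": "x86_64",
    "i686": "x86",
    "i386": "x86",
}


def abi_for_arch(arch: str, default: str = "arm64-v8a") -> str:
    """Return the Android ABI for an architecture name (hypothetical helper)."""
    return ARCH_TO_ABI.get(arch.lower(), default)


for arch in ("aarch64", "armv7l", "x86_64", "i686"):
    print(arch, "->", abi_for_arch(arch))
```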

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-ml_IN-meera-medium with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-ml_IN-meera-medium/ml_IN-meera-medium.onnx";
  config.model.vits.tokens = "vits-piper-ml_IN-meera-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-ml_IN-meera-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "മണ്ണ് മരിക്കുമ്പോൾ കാട്ടിലെ വെള്ളവും മരിക്കുന്നു.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc   -I /tmp/sherpa-onnx/shared/include   -L /tmp/sherpa-onnx/shared/lib   -o /tmp/test-piper   /tmp/test-piper.c   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-ml_IN-meera-medium.tar.bz2

You can use the following code to play with vits-piper-ml_IN-meera-medium with Python API.

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-ml_IN-meera-medium/ml_IN-meera-medium.onnx",
            data_dir="vits-piper-ml_IN-meera-medium/espeak-ng-data",
            tokens="vits-piper-ml_IN-meera-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="മണ്ണ് മരിക്കുമ്പോൾ കാട്ടിലെ വെള്ളവും മരിക്കുന്നു.",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
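
If soundfile is not available, the generated samples can also be saved with Python's standard wave module. The sketch below assumes the samples are floats in the range [-1, 1]. A synthetic 440 Hz tone stands in for audio.samples, and write_wave_int16 is a hypothetical helper, not a sherpa-onnx function.

```python
import math
import struct
import wave


def write_wave_int16(filename, samples, sample_rate):
    """Write float samples in [-1, 1] as a mono 16-bit PCM WAV file."""
    frames = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))  # clip so the int16 conversion cannot overflow
        frames += struct.pack("<h", int(s * 32767))
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)  # mono
        f.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        f.setframerate(sample_rate)
        f.writeframes(bytes(frames))


# Stand-in for audio.samples and audio.sample_rate: one second of a 440 Hz tone.
sample_rate = 22050
samples = [0.3 * math.sin(2 * math.pi * 440 * n / sample_rate)
           for n in range(sample_rate)]
write_wave_int16("tone.wav", samples, sample_rate)
```

With real output, the call would be write_wave_int16("test.wav", audio.samples, audio.sample_rate).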

C++ API

You can use the following code to play with vits-piper-ml_IN-meera-medium with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-ml_IN-meera-medium/ml_IN-meera-medium.onnx";
  config.model.vits.tokens = "vits-piper-ml_IN-meera-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-ml_IN-meera-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "മണ്ണ് മരിക്കുമ്പോൾ കാട്ടിലെ വെള്ളവും മരിക്കുന്നു.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}
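
The callback contract used above (return 1 to continue, 0 to stop) can be illustrated without the library. The following Python sketch is only a simulation: generate_in_chunks is a hypothetical stand-in for a synthesizer that produces audio chunk by chunk, and whether the real engine keeps the chunk on which the callback returned 0 is an assumption.

```python
from typing import Callable, List, Sequence


def generate_in_chunks(
    chunks: Sequence[List[float]],
    callback: Callable[[List[float], float], int],
) -> List[float]:
    """Hypothetical stand-in for an engine that synthesizes chunk by chunk."""
    produced: List[float] = []
    for i, chunk in enumerate(chunks, start=1):
        produced.extend(chunk)
        progress = i / len(chunks)
        if callback(chunk, progress) == 0:  # 0 asks the engine to stop early
            break
    return produced


def stop_at_half(samples: List[float], progress: float) -> int:
    """Report progress and request a stop once half of the text is done."""
    print(f"Progress: {progress * 100:.1f}%")
    return 1 if progress < 0.5 else 0


fake_chunks = [[0.0] * 4, [0.1] * 4, [0.2] * 4, [0.3] * 4]
audio = generate_in_chunks(fake_chunks, stop_at_half)
print("Samples produced:", len(audio))
```

Returning 0 is useful when playback is cancelled by the user, since the remaining sentences are never synthesized.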

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++   -std=c++17   -I /tmp/sherpa-onnx/shared/include   -L /tmp/sherpa-onnx/shared/lib   -o /tmp/test-piper   /tmp/test-piper.cc   -lsherpa-onnx-cxx-api   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with vits-piper-ml_IN-meera-medium with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-ml_IN-meera-medium/ml_IN-meera-medium.onnx".into()),
                tokens: Some("vits-piper-ml_IN-meera-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-ml_IN-meera-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "മണ്ണ് മരിക്കുമ്പോൾ കാട്ടിലെ വെള്ളവും മരിക്കുന്നു.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-ml_IN-meera-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-ml_IN-meera-medium/ml_IN-meera-medium.onnx',
        tokens: 'vits-piper-ml_IN-meera-medium/tokens.txt',
        dataDir: 'vits-piper-ml_IN-meera-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'മണ്ണ് മരിക്കുമ്പോൾ കാട്ടിലെ വെള്ളവും മരിക്കുന്നു.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.
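
The real-time factor (RTF) printed by the example above is the synthesis time divided by the duration of the generated audio, so a value below 1 means synthesis is faster than playback. A minimal Python sketch of the same arithmetic follows; the function name and the timing numbers are illustrative, not measurements.

```python
def real_time_factor(elapsed_seconds: float, num_samples: int, sample_rate: int) -> float:
    """RTF = time spent synthesizing / duration of the synthesized audio."""
    duration = num_samples / sample_rate  # audio length in seconds
    return elapsed_seconds / duration


# Illustrative numbers: 44100 samples at 22050 Hz is 2 seconds of audio.
rtf = real_time_factor(elapsed_seconds=0.5, num_samples=44100, sample_rate=22050)
print(f"RTF = {rtf:.3f}")  # prints "RTF = 0.250"
```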

Dart API

You can use the following code to play with vits-piper-ml_IN-meera-medium with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-ml_IN-meera-medium/ml_IN-meera-medium.onnx',
    tokens: 'vits-piper-ml_IN-meera-medium/tokens.txt',
    dataDir: 'vits-piper-ml_IN-meera-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'മണ്ണ് മരിക്കുമ്പോൾ കാട്ടിലെ വെള്ളവും മരിക്കുന്നു.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-ml_IN-meera-medium with Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-ml_IN-meera-medium/ml_IN-meera-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-ml_IN-meera-medium/tokens.txt",
    dataDir: "vits-piper-ml_IN-meera-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "മണ്ണ് മരിക്കുമ്പോൾ കാട്ടിലെ വെള്ളവും മരിക്കുന്നു."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-ml_IN-meera-medium with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-ml_IN-meera-medium/ml_IN-meera-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-ml_IN-meera-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-ml_IN-meera-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "മണ്ണ് മരിക്കുമ്പോൾ കാട്ടിലെ വെള്ളവും മരിക്കുന്നു.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-ml_IN-meera-medium with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-ml_IN-meera-medium/ml_IN-meera-medium.onnx",
        tokens = "vits-piper-ml_IN-meera-medium/tokens.txt",
        dataDir = "vits-piper-ml_IN-meera-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "മണ്ണ് മരിക്കുമ്പോൾ കാട്ടിലെ വെള്ളവും മരിക്കുന്നു.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with vits-piper-ml_IN-meera-medium with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-ml_IN-meera-medium/ml_IN-meera-medium.onnx");
    vits.setTokens("vits-piper-ml_IN-meera-medium/tokens.txt");
    vits.setDataDir("vits-piper-ml_IN-meera-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "മണ്ണ് മരിക്കുമ്പോൾ കാട്ടിലെ വെള്ളവും മരിക്കുന്നു.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-ml_IN-meera-medium with Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-ml_IN-meera-medium/ml_IN-meera-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-ml_IN-meera-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-ml_IN-meera-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('മണ്ണ് മരിക്കുമ്പോൾ കാട്ടിലെ വെള്ളവും മരിക്കുന്നു.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-ml_IN-meera-medium with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-ml_IN-meera-medium/ml_IN-meera-medium.onnx",
				Tokens:  "vits-piper-ml_IN-meera-medium/tokens.txt",
				DataDir: "vits-piper-ml_IN-meera-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "മണ്ണ് മരിക്കുമ്പോൾ കാട്ടിലെ വെള്ളവും മരിക്കുന്നു."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

മണ്ണ് മരിക്കുമ്പോൾ കാട്ടിലെ വെള്ളവും മരിക്കുന്നു.

sample audio for each speaker is listed below:

Speaker 0

Nepali

This section lists text to speech models for Nepali.

vits-piper-ne_NP-chitwan-medium

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/ne/ne_NP/chitwan/medium

Number of speakers | Sample rate
1 | 22050

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-ne_NP-chitwan-medium.tar.bz2

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI | URL | China mirror
arm64-v8a | Download | Download
armeabi-v7a | Download | Download
x86_64 | Download | Download
x86 | Download | Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-ne_NP-chitwan-medium with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-ne_NP-chitwan-medium/ne_NP-chitwan-medium.onnx";
  config.model.vits.tokens = "vits-piper-ne_NP-chitwan-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-ne_NP-chitwan-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "घाँसको पातले पहाडलाई अभिवादन गर्दै झुक्छ।";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc   -I /tmp/sherpa-onnx/shared/include   -L /tmp/sherpa-onnx/shared/lib   -o /tmp/test-piper   /tmp/test-piper.c   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-ne_NP-chitwan-medium.tar.bz2

You can use the following code to play with vits-piper-ne_NP-chitwan-medium with Python API.

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-ne_NP-chitwan-medium/ne_NP-chitwan-medium.onnx",
            data_dir="vits-piper-ne_NP-chitwan-medium/espeak-ng-data",
            tokens="vits-piper-ne_NP-chitwan-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="घाँसको पातले पहाडलाई अभिवादन गर्दै झुक्छ।",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
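
A missing or misplaced model file is a common reason for a config to fail validation. The sketch below checks for the three paths that the examples in this section pass to the VITS config. It assumes the layout shown in those examples (a directory named vits-piper-<voice> containing <voice>.onnx, tokens.txt, and espeak-ng-data); expected_files and missing_files are hypothetical helpers, not sherpa-onnx functions.

```python
from pathlib import Path
from typing import List

PREFIX = "vits-piper-"


def expected_files(model_dir: str) -> List[Path]:
    """Paths the examples in this section pass to the VITS model config."""
    root = Path(model_dir)
    name = root.name
    # Directory "vits-piper-<voice>" is assumed to contain "<voice>.onnx".
    voice = name[len(PREFIX):] if name.startswith(PREFIX) else name
    return [root / f"{voice}.onnx", root / "tokens.txt", root / "espeak-ng-data"]


def missing_files(model_dir: str) -> List[Path]:
    """Return the expected paths that do not exist on disk."""
    return [p for p in expected_files(model_dir) if not p.exists()]


missing = missing_files("vits-piper-ne_NP-chitwan-medium")
if missing:
    print("Missing:", ", ".join(str(p) for p in missing))
else:
    print("All model files found")
```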

C++ API

You can use the following code to play with vits-piper-ne_NP-chitwan-medium with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-ne_NP-chitwan-medium/ne_NP-chitwan-medium.onnx";
  config.model.vits.tokens = "vits-piper-ne_NP-chitwan-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-ne_NP-chitwan-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "घाँसको पातले पहाडलाई अभिवादन गर्दै झुक्छ।";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++   -std=c++17   -I /tmp/sherpa-onnx/shared/include   -L /tmp/sherpa-onnx/shared/lib   -o /tmp/test-piper   /tmp/test-piper.cc   -lsherpa-onnx-cxx-api   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with vits-piper-ne_NP-chitwan-medium with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-ne_NP-chitwan-medium/ne_NP-chitwan-medium.onnx".into()),
                tokens: Some("vits-piper-ne_NP-chitwan-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-ne_NP-chitwan-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "घाँसको पातले पहाडलाई अभिवादन गर्दै झुक्छ।";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-ne_NP-chitwan-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-ne_NP-chitwan-medium/ne_NP-chitwan-medium.onnx',
        tokens: 'vits-piper-ne_NP-chitwan-medium/tokens.txt',
        dataDir: 'vits-piper-ne_NP-chitwan-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'घाँसको पातले पहाडलाई अभिवादन गर्दै झुक्छ।';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.
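
The Node.js example above prints a real-time factor (RTF): the time spent generating divided by the duration of the generated audio, so values below 1 mean synthesis runs faster than playback. A minimal Python sketch of the same arithmetic follows. The helper name real_time_factor and the numbers in the example call are made up for illustration and are not part of sherpa-onnx.

```python
def real_time_factor(elapsed_seconds: float, num_samples: int, sample_rate: int) -> float:
    """Return elapsed synthesis time divided by the duration of the generated audio."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    # Duration of the waveform in seconds, as in the Node.js example.
    duration_seconds = num_samples / sample_rate
    return elapsed_seconds / duration_seconds


# Illustrative numbers only: 0.5 s of compute for 44100 samples at 22050 Hz (2 s of audio).
rtf = real_time_factor(0.5, 44100, 22050)
print(f"RTF = {rtf:.3f}")
```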

Dart API

You can use the following code to try vits-piper-ne_NP-chitwan-medium with the Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-ne_NP-chitwan-medium/ne_NP-chitwan-medium.onnx',
    tokens: 'vits-piper-ne_NP-chitwan-medium/tokens.txt',
    dataDir: 'vits-piper-ne_NP-chitwan-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'घाँसको पातले पहाडलाई अभिवादन गर्दै झुक्छ।', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to try vits-piper-ne_NP-chitwan-medium with the Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-ne_NP-chitwan-medium/ne_NP-chitwan-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-ne_NP-chitwan-medium/tokens.txt",
    dataDir: "vits-piper-ne_NP-chitwan-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "घाँसको पातले पहाडलाई अभिवादन गर्दै झुक्छ।"
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to try vits-piper-ne_NP-chitwan-medium with the C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-ne_NP-chitwan-medium/ne_NP-chitwan-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-ne_NP-chitwan-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-ne_NP-chitwan-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "घाँसको पातले पहाडलाई अभिवादन गर्दै झुक्छ।";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to try vits-piper-ne_NP-chitwan-medium with the Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-ne_NP-chitwan-medium/ne_NP-chitwan-medium.onnx",
        tokens = "vits-piper-ne_NP-chitwan-medium/tokens.txt",
        dataDir = "vits-piper-ne_NP-chitwan-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "घाँसको पातले पहाडलाई अभिवादन गर्दै झुक्छ।",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to try vits-piper-ne_NP-chitwan-medium with the Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-ne_NP-chitwan-medium/ne_NP-chitwan-medium.onnx");
    vits.setTokens("vits-piper-ne_NP-chitwan-medium/tokens.txt");
    vits.setDataDir("vits-piper-ne_NP-chitwan-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "घाँसको पातले पहाडलाई अभिवादन गर्दै झुक्छ।";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to try vits-piper-ne_NP-chitwan-medium with the Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-ne_NP-chitwan-medium/ne_NP-chitwan-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-ne_NP-chitwan-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-ne_NP-chitwan-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('घाँसको पातले पहाडलाई अभिवादन गर्दै झुक्छ।', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to try vits-piper-ne_NP-chitwan-medium with the Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-ne_NP-chitwan-medium/ne_NP-chitwan-medium.onnx",
				Tokens:  "vits-piper-ne_NP-chitwan-medium/tokens.txt",
				DataDir: "vits-piper-ne_NP-chitwan-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "घाँसको पातले पहाडलाई अभिवादन गर्दै झुक्छ।"

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

घाँसको पातले पहाडलाई अभिवादन गर्दै झुक्छ।

sample audio for each speaker is listed below:

Speaker 0

vits-piper-ne_NP-google-medium

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/ne/ne_NP/google/medium

| Number of speakers | Sample rate (Hz) |
|---|---|
| 18 | 22050 |

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-ne_NP-google-medium.tar.bz2
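
As a rough sketch, the download and extraction can also be scripted in Python with the standard library only. The helper names model_url and download_and_extract are hypothetical, and the code assumes the archive unpacks into a directory named after the model, which matches the paths used in the examples on this page.

```python
import tarfile
import urllib.request
from pathlib import Path

# Release location taken from the download address above.
RELEASE_BASE = "https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models"


def model_url(model_name: str) -> str:
    """Build the archive URL for a model name such as vits-piper-ne_NP-google-medium."""
    return f"{RELEASE_BASE}/{model_name}.tar.bz2"


def download_and_extract(model_name: str, dest_dir: str = ".") -> Path:
    """Download the archive, unpack it into dest_dir, and delete the archive."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    archive = dest / f"{model_name}.tar.bz2"
    urllib.request.urlretrieve(model_url(model_name), str(archive))
    with tarfile.open(archive, "r:bz2") as tar:
        tar.extractall(dest)
    archive.unlink()
    # Assumption: the archive contains a top-level directory named after the model.
    return dest / model_name
```

Calling download_and_extract("vits-piper-ne_NP-google-medium") would fetch the archive into the current directory; it needs network access and is equivalent to running wget, tar xvf, and rm by hand.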

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

| ABI | URL | China mirror |
|---|---|---|
| arm64-v8a | Download | Download |
| armeabi-v7a | Download | Download |
| x86_64 | Download | Download |
| x86 | Download | Download |

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to try vits-piper-ne_NP-google-medium with the C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-ne_NP-google-medium/ne_NP-google-medium.onnx";
  config.model.vits.tokens = "vits-piper-ne_NP-google-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-ne_NP-google-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "घाँसको पातले पहाडलाई अभिवादन गर्दै झुक्छ।";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc   -I /tmp/sherpa-onnx/shared/include   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-c-api   -lonnxruntime   -o /tmp/test-piper   /tmp/test-piper.c

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.
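
If you launch the compiled example from a Python script instead of a shell, the same library search path can be passed through the child process environment. The sketch below is an illustration under that assumption; env_with_library_path is a hypothetical helper, not something shipped with sherpa-onnx.

```python
import os
import sys


def env_with_library_path(lib_dir, platform=sys.platform, base_env=None):
    """Return a copy of the environment with lib_dir prepended to the loader path.

    Uses DYLD_LIBRARY_PATH on macOS and LD_LIBRARY_PATH elsewhere, mirroring
    the two export commands above.
    """
    env = dict(os.environ if base_env is None else base_env)
    var = "DYLD_LIBRARY_PATH" if platform == "darwin" else "LD_LIBRARY_PATH"
    existing = env.get(var, "")
    # Prepend so that the freshly built libraries take precedence.
    env[var] = lib_dir if not existing else f"{lib_dir}:{existing}"
    return env
```

The returned mapping can be passed as the env argument of subprocess.run, for example subprocess.run(["./test-piper"], cwd="/tmp", env=env_with_library_path("/tmp/sherpa-onnx/shared/lib")).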

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-ne_NP-google-medium.tar.bz2

You can use the following code to play with vits-piper-ne_NP-google-medium

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-ne_NP-google-medium/ne_NP-google-medium.onnx",
            data_dir="vits-piper-ne_NP-google-medium/espeak-ng-data",
            tokens="vits-piper-ne_NP-google-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="घाँसको पातले पहाडलाई अभिवादन गर्दै झुक्छ।",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
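
The Python example above relies on the third-party soundfile package to save the audio. If that package is not available, a 16-bit PCM WAV file can be written with the standard library alone. This is a minimal sketch: write_wave_int16 is a hypothetical helper, and it assumes the generated samples are mono floats in the range [-1, 1].

```python
import struct
import wave


def write_wave_int16(filename, samples, sample_rate):
    """Write mono float samples in [-1, 1] as a 16-bit PCM WAV file."""
    frames = bytearray()
    for s in samples:
        # Clip out-of-range values, then scale to the signed 16-bit range.
        clipped = max(-1.0, min(1.0, float(s)))
        frames += struct.pack("<h", int(round(clipped * 32767)))
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)   # mono
        f.setsampwidth(2)   # 2 bytes per sample = 16 bit
        f.setframerate(sample_rate)
        f.writeframes(bytes(frames))
```

With the objects from the example above, the call would be write_wave_int16("test.wav", audio.samples, audio.sample_rate).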

C++ API

You can use the following code to try vits-piper-ne_NP-google-medium with the C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-ne_NP-google-medium/ne_NP-google-medium.onnx";
  config.model.vits.tokens = "vits-piper-ne_NP-google-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-ne_NP-google-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "घाँसको पातले पहाडलाई अभिवादन गर्दै झुक्छ।";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++   -std=c++17   -I /tmp/sherpa-onnx/shared/include   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-cxx-api   -lsherpa-onnx-c-api   -lonnxruntime   -o /tmp/test-piper   /tmp/test-piper.cc

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html
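
The progress callback in the C++ example returns 1 to continue generating and 0 to stop. The same convention can be used to cut generation short, for instance once enough audio has been produced. The Python sketch below illustrates only the decision logic. StopAfterSeconds is a hypothetical class, and it assumes each call receives the newly generated chunk of samples; whether a given language binding accepts a callable of this shape is not shown in this document.

```python
class StopAfterSeconds:
    """Callback-style object that asks the generator to stop after max_seconds of audio."""

    def __init__(self, sample_rate, max_seconds):
        self.max_samples = int(sample_rate * max_seconds)
        self.received = 0

    def __call__(self, samples, progress):
        # Assumption: `samples` holds only the chunk generated since the last call.
        self.received += len(samples)
        # 1 means continue, 0 means stop, as in the C++ example.
        return 1 if self.received < self.max_samples else 0
```

An instance such as StopAfterSeconds(22050, 1.0) keeps returning 1 until at least 22050 samples have been seen, then returns 0.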

Rust API

You can use the following code to try vits-piper-ne_NP-google-medium with the Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-ne_NP-google-medium/ne_NP-google-medium.onnx".into()),
                tokens: Some("vits-piper-ne_NP-google-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-ne_NP-google-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "घाँसको पातले पहाडलाई अभिवादन गर्दै झुक्छ।";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-ne_NP-google-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-ne_NP-google-medium/ne_NP-google-medium.onnx',
        tokens: 'vits-piper-ne_NP-google-medium/tokens.txt',
        dataDir: 'vits-piper-ne_NP-google-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'घाँसको पातले पहाडलाई अभिवादन गर्दै झुक्छ।';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to try vits-piper-ne_NP-google-medium with the Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-ne_NP-google-medium/ne_NP-google-medium.onnx',
    tokens: 'vits-piper-ne_NP-google-medium/tokens.txt',
    dataDir: 'vits-piper-ne_NP-google-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'घाँसको पातले पहाडलाई अभिवादन गर्दै झुक्छ।', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to try vits-piper-ne_NP-google-medium with the Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-ne_NP-google-medium/ne_NP-google-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-ne_NP-google-medium/tokens.txt",
    dataDir: "vits-piper-ne_NP-google-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "घाँसको पातले पहाडलाई अभिवादन गर्दै झुक्छ।"
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to try vits-piper-ne_NP-google-medium with the C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-ne_NP-google-medium/ne_NP-google-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-ne_NP-google-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-ne_NP-google-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "घाँसको पातले पहाडलाई अभिवादन गर्दै झुक्छ।";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to try vits-piper-ne_NP-google-medium with the Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-ne_NP-google-medium/ne_NP-google-medium.onnx",
        tokens = "vits-piper-ne_NP-google-medium/tokens.txt",
        dataDir = "vits-piper-ne_NP-google-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "घाँसको पातले पहाडलाई अभिवादन गर्दै झुक्छ।",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to try vits-piper-ne_NP-google-medium with the Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-ne_NP-google-medium/ne_NP-google-medium.onnx");
    vits.setTokens("vits-piper-ne_NP-google-medium/tokens.txt");
    vits.setDataDir("vits-piper-ne_NP-google-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "घाँसको पातले पहाडलाई अभिवादन गर्दै झुक्छ।";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to try vits-piper-ne_NP-google-medium with the Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-ne_NP-google-medium/ne_NP-google-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-ne_NP-google-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-ne_NP-google-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('घाँसको पातले पहाडलाई अभिवादन गर्दै झुक्छ।', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to try vits-piper-ne_NP-google-medium with the Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-ne_NP-google-medium/ne_NP-google-medium.onnx",
				Tokens:  "vits-piper-ne_NP-google-medium/tokens.txt",
				DataDir: "vits-piper-ne_NP-google-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "घाँसको पातले पहाडलाई अभिवादन गर्दै झुक्छ।"

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

घाँसको पातले पहाडलाई अभिवादन गर्दै झुक्छ।

sample audio for each speaker is listed below:

Speaker 0

Speaker 1

Speaker 2

Speaker 3

Speaker 4

Speaker 5

Speaker 6

Speaker 7

Speaker 8

Speaker 9

Speaker 10

Speaker 11

Speaker 12

Speaker 13

Speaker 14

Speaker 15

Speaker 16

Speaker 17
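
To reproduce one sample per speaker yourself, the speaker ID (sid) can simply be looped from 0 to 17. The sketch below keeps the loop independent of sherpa-onnx by taking the generate function as a parameter; synthesize_all_speakers and the output file naming are hypothetical choices made for illustration.

```python
def synthesize_all_speakers(generate, num_speakers, text, stem="test"):
    """Call generate(text=..., sid=..., speed=...) once per speaker ID.

    Returns a list of (filename, audio) pairs; saving is left to the caller.
    """
    results = []
    for sid in range(num_speakers):
        audio = generate(text=text, sid=sid, speed=1.0)
        # Hypothetical naming scheme: test-speaker-0.wav, test-speaker-1.wav, ...
        results.append((f"{stem}-speaker-{sid}.wav", audio))
    return results
```

With the Python example for this model, the call would be synthesize_all_speakers(tts.generate, 18, text), followed by sf.write(name, audio.samples, samplerate=audio.sample_rate) for each returned pair.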

vits-piper-ne_NP-google-x_low

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/ne/ne_NP/google/x_low

| Number of speakers | Sample rate (Hz) |
|---|---|
| 18 | 16000 |

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-ne_NP-google-x_low.tar.bz2

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

| ABI | URL | China mirror |
|---|---|---|
| arm64-v8a | Download | Download |
| armeabi-v7a | Download | Download |
| x86_64 | Download | Download |
| x86 | Download | Download |

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to try vits-piper-ne_NP-google-x_low with the C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-ne_NP-google-x_low/ne_NP-google-x_low.onnx";
  config.model.vits.tokens = "vits-piper-ne_NP-google-x_low/tokens.txt";
  config.model.vits.data_dir = "vits-piper-ne_NP-google-x_low/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "घाँसको पातले पहाडलाई अभिवादन गर्दै झुक्छ।";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc   -o /tmp/test-piper   /tmp/test-piper.c   -I /tmp/sherpa-onnx/shared/include   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-ne_NP-google-x_low.tar.bz2

You can use the following code to play with vits-piper-ne_NP-google-x_low

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-ne_NP-google-x_low/ne_NP-google-x_low.onnx",
            data_dir="vits-piper-ne_NP-google-x_low/espeak-ng-data",
            tokens="vits-piper-ne_NP-google-x_low/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="घाँसको पातले पहाडलाई अभिवादन गर्दै झुक्छ।",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
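Because this model has 18 speakers, the same call can be repeated with different sid values to produce one file per speaker. The sketch below assumes the model has been extracted to the current directory; the constant NUM_SPEAKERS, the helper names, and the output file pattern are choices made for this illustration.

```python
NUM_SPEAKERS = 18  # from the model info table in this section


def speaker_filename(sid: int) -> str:
    """Output file name for one speaker (hypothetical naming scheme)."""
    return f"speaker-{sid}.wav"


def generate_all_speakers(text: str) -> None:
    """Write one wave file per speaker ID for the given text."""
    # Imported here so the helper above works without the packages installed.
    import sherpa_onnx
    import soundfile as sf

    model_dir = "vits-piper-ne_NP-google-x_low"
    config = sherpa_onnx.OfflineTtsConfig(
        model=sherpa_onnx.OfflineTtsModelConfig(
            vits=sherpa_onnx.OfflineTtsVitsModelConfig(
                model=f"{model_dir}/ne_NP-google-x_low.onnx",
                data_dir=f"{model_dir}/espeak-ng-data",
                tokens=f"{model_dir}/tokens.txt",
            ),
            num_threads=1,
        ),
    )
    if not config.validate():
        raise ValueError("Please check your config")

    tts = sherpa_onnx.OfflineTts(config)
    for sid in range(NUM_SPEAKERS):
        audio = tts.generate(text=text, sid=sid, speed=1.0)
        sf.write(speaker_filename(sid), audio.samples,
                 samplerate=audio.sample_rate)
```

Calling generate_all_speakers with the sample sentence used in this section writes speaker-0.wav through speaker-17.wav.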

C++ API

You can use the following code to play with vits-piper-ne_NP-google-x_low with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-ne_NP-google-x_low/ne_NP-google-x_low.onnx";
  config.model.vits.tokens = "vits-piper-ne_NP-google-x_low/tokens.txt";
  config.model.vits.data_dir = "vits-piper-ne_NP-google-x_low/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "घाँसको पातले पहाडलाई अभिवादन गर्दै झुक्छ।";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++   -std=c++17   -o /tmp/test-piper   /tmp/test-piper.cc   -I /tmp/sherpa-onnx/shared/include   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-cxx-api   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with vits-piper-ne_NP-google-x_low with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-ne_NP-google-x_low/ne_NP-google-x_low.onnx".into()),
                tokens: Some("vits-piper-ne_NP-google-x_low/tokens.txt".into()),
                data_dir: Some("vits-piper-ne_NP-google-x_low/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "घाँसको पातले पहाडलाई अभिवादन गर्दै झुक्छ।";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-ne_NP-google-x_low with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-ne_NP-google-x_low/ne_NP-google-x_low.onnx',
        tokens: 'vits-piper-ne_NP-google-x_low/tokens.txt',
        dataDir: 'vits-piper-ne_NP-google-x_low/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'घाँसको पातले पहाडलाई अभिवादन गर्दै झुक्छ।';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.
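The real-time factor (RTF) printed by the example above is the processing time divided by the duration of the generated audio, so a value below 1 means generation ran faster than playback. As a small worked example in Python (the function name and the numbers are illustrative, not measurements):

```python
def real_time_factor(elapsed_seconds: float, num_samples: int,
                     sample_rate: int) -> float:
    """RTF = processing time / duration of the generated audio."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # seconds of audio
    return elapsed_seconds / duration


# 32000 samples at 16000 Hz are 2 seconds of audio; generating them in
# 0.5 seconds gives RTF = 0.5 / 2 = 0.25.
print(f"RTF = {real_time_factor(0.5, 32000, 16000):.3f}")  # prints "RTF = 0.250"
```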

Dart API

You can use the following code to play with vits-piper-ne_NP-google-x_low with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-ne_NP-google-x_low/ne_NP-google-x_low.onnx',
    tokens: 'vits-piper-ne_NP-google-x_low/tokens.txt',
    dataDir: 'vits-piper-ne_NP-google-x_low/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'घाँसको पातले पहाडलाई अभिवादन गर्दै झुक्छ।', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-ne_NP-google-x_low with Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-ne_NP-google-x_low/ne_NP-google-x_low.onnx",
    lexicon: "",
    tokens: "vits-piper-ne_NP-google-x_low/tokens.txt",
    dataDir: "vits-piper-ne_NP-google-x_low/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "घाँसको पातले पहाडलाई अभिवादन गर्दै झुक्छ।"
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-ne_NP-google-x_low with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-ne_NP-google-x_low/ne_NP-google-x_low.onnx";
config.Model.Vits.Tokens = "vits-piper-ne_NP-google-x_low/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-ne_NP-google-x_low/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "घाँसको पातले पहाडलाई अभिवादन गर्दै झुक्छ।";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-ne_NP-google-x_low with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-ne_NP-google-x_low/ne_NP-google-x_low.onnx",
        tokens = "vits-piper-ne_NP-google-x_low/tokens.txt",
        dataDir = "vits-piper-ne_NP-google-x_low/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "घाँसको पातले पहाडलाई अभिवादन गर्दै झुक्छ।",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with vits-piper-ne_NP-google-x_low with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-ne_NP-google-x_low/ne_NP-google-x_low.onnx");
    vits.setTokens("vits-piper-ne_NP-google-x_low/tokens.txt");
    vits.setDataDir("vits-piper-ne_NP-google-x_low/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "घाँसको पातले पहाडलाई अभिवादन गर्दै झुक्छ।";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-ne_NP-google-x_low with Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-ne_NP-google-x_low/ne_NP-google-x_low.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-ne_NP-google-x_low/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-ne_NP-google-x_low/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('घाँसको पातले पहाडलाई अभिवादन गर्दै झुक्छ।', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-ne_NP-google-x_low with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-ne_NP-google-x_low/ne_NP-google-x_low.onnx",
				Tokens:  "vits-piper-ne_NP-google-x_low/tokens.txt",
				DataDir: "vits-piper-ne_NP-google-x_low/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "घाँसको पातले पहाडलाई अभिवादन गर्दै झुक्छ।"

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

घाँसको पातले पहाडलाई अभिवादन गर्दै झुक्छ।

sample audios for different speakers are listed below:

Speaker 0

Speaker 1

Speaker 2

Speaker 3

Speaker 4

Speaker 5

Speaker 6

Speaker 7

Speaker 8

Speaker 9

Speaker 10

Speaker 11

Speaker 12

Speaker 13

Speaker 14

Speaker 15

Speaker 16

Speaker 17

Norwegian

This section lists text-to-speech models for Norwegian.

vits-piper-no_NO-talesyntese-medium

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/no/no_NO/talesyntese/medium

Number of speakers | Sample rate (Hz)
1 | 22050

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-no_NO-talesyntese-medium.tar.bz2
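Every example in this section expects three items in the extracted directory: the .onnx model, tokens.txt, and the espeak-ng-data directory. A quick way to verify the extraction before running an example might look like the following Python sketch; missing_model_files is a hypothetical helper, not a sherpa-onnx function.

```python
from pathlib import Path
from typing import List


def missing_model_files(model_dir: str, onnx_name: str) -> List[str]:
    """Return the expected paths that do not exist under model_dir."""
    root = Path(model_dir)
    required = [
        root / onnx_name,         # acoustic model
        root / "tokens.txt",      # token table
        root / "espeak-ng-data",  # espeak-ng data directory
    ]
    return [str(p) for p in required if not p.exists()]


missing = missing_model_files(
    "vits-piper-no_NO-talesyntese-medium", "no_NO-talesyntese-medium.onnx"
)
if missing:
    print("Missing:", ", ".join(missing))
else:
    print("All model files found")
```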

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI | URL | China mirror
arm64-v8a | Download | Download
armeabi-v7a | Download | Download
x86_64 | Download | Download
x86 | Download | Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-no_NO-talesyntese-medium with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-no_NO-talesyntese-medium/no_NO-talesyntese-medium.onnx";
  config.model.vits.tokens = "vits-piper-no_NO-talesyntese-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-no_NO-talesyntese-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Uskyldig kan stormen veroorzaken";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc   -o /tmp/test-piper   /tmp/test-piper.c   -I /tmp/sherpa-onnx/shared/include   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-no_NO-talesyntese-medium.tar.bz2

You can use the following code to play with vits-piper-no_NO-talesyntese-medium

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-no_NO-talesyntese-medium/no_NO-talesyntese-medium.onnx",
            data_dir="vits-piper-no_NO-talesyntese-medium/espeak-ng-data",
            tokens="vits-piper-no_NO-talesyntese-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Uskyldig kan stormen veroorzaken",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
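The speed argument controls the speaking rate; as the comment in the C++ example puts it, larger values give faster speech. The sketch below renders the same sentence at a few speeds for comparison. The speed values and the output file names are arbitrary choices for this illustration, and the model is assumed to be extracted to the current directory.

```python
SPEEDS = (0.8, 1.0, 1.25)  # illustrative values; the examples here use 1.0


def speed_filename(speed: float) -> str:
    """Output file name for one speed setting (hypothetical naming scheme)."""
    return f"test-speed-{speed:.2f}.wav"


def generate_at_speeds(text: str) -> None:
    """Write one wave file per entry in SPEEDS for the given text."""
    # Imported here so the helper above works without the packages installed.
    import sherpa_onnx
    import soundfile as sf

    model_dir = "vits-piper-no_NO-talesyntese-medium"
    config = sherpa_onnx.OfflineTtsConfig(
        model=sherpa_onnx.OfflineTtsModelConfig(
            vits=sherpa_onnx.OfflineTtsVitsModelConfig(
                model=f"{model_dir}/no_NO-talesyntese-medium.onnx",
                data_dir=f"{model_dir}/espeak-ng-data",
                tokens=f"{model_dir}/tokens.txt",
            ),
            num_threads=1,
        ),
    )
    if not config.validate():
        raise ValueError("Please check your config")

    tts = sherpa_onnx.OfflineTts(config)
    for speed in SPEEDS:
        audio = tts.generate(text=text, sid=0, speed=speed)
        sf.write(speed_filename(speed), audio.samples,
                 samplerate=audio.sample_rate)
```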

C++ API

You can use the following code to play with vits-piper-no_NO-talesyntese-medium with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-no_NO-talesyntese-medium/no_NO-talesyntese-medium.onnx";
  config.model.vits.tokens = "vits-piper-no_NO-talesyntese-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-no_NO-talesyntese-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Uskyldig kan stormen veroorzaken";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++   -std=c++17   -o /tmp/test-piper   /tmp/test-piper.cc   -I /tmp/sherpa-onnx/shared/include   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-cxx-api   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with vits-piper-no_NO-talesyntese-medium with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-no_NO-talesyntese-medium/no_NO-talesyntese-medium.onnx".into()),
                tokens: Some("vits-piper-no_NO-talesyntese-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-no_NO-talesyntese-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Uskyldig kan stormen veroorzaken";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-no_NO-talesyntese-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-no_NO-talesyntese-medium/no_NO-talesyntese-medium.onnx',
        tokens: 'vits-piper-no_NO-talesyntese-medium/tokens.txt',
        dataDir: 'vits-piper-no_NO-talesyntese-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Uskyldig kan stormen veroorzaken';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to play with vits-piper-no_NO-talesyntese-medium with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-no_NO-talesyntese-medium/no_NO-talesyntese-medium.onnx',
    tokens: 'vits-piper-no_NO-talesyntese-medium/tokens.txt',
    dataDir: 'vits-piper-no_NO-talesyntese-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Uskyldig kan stormen veroorzaken', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-no_NO-talesyntese-medium with Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-no_NO-talesyntese-medium/no_NO-talesyntese-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-no_NO-talesyntese-medium/tokens.txt",
    dataDir: "vits-piper-no_NO-talesyntese-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Uskyldig kan stormen veroorzaken"
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-no_NO-talesyntese-medium with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-no_NO-talesyntese-medium/no_NO-talesyntese-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-no_NO-talesyntese-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-no_NO-talesyntese-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Uskyldig kan stormen veroorzaken";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-no_NO-talesyntese-medium with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-no_NO-talesyntese-medium/no_NO-talesyntese-medium.onnx",
        tokens = "vits-piper-no_NO-talesyntese-medium/tokens.txt",
        dataDir = "vits-piper-no_NO-talesyntese-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Uskyldig kan stormen veroorzaken",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.
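
The callback in the example above returns 1 to continue and 0 to stop. The following toy simulation, written in Python, illustrates that protocol only. It does not call sherpa-onnx: `run_generation`, `stop_after_two`, and the list-of-chunks model of the audio are hypothetical names and assumptions made for illustration.

```python
def run_generation(chunks, callback):
    """Simulate chunk-by-chunk generation with an early-stop callback.

    `chunks` stands in for pieces of generated audio. This is a toy model
    of the return-value protocol; it is not the sherpa-onnx implementation.
    """
    delivered = []
    for chunk in chunks:
        delivered.append(chunk)
        if callback(chunk) == 0:  # 0 means stop, 1 means continue
            break
    return delivered


def stop_after_two():
    """Build a callback that asks to stop once it has seen two chunks."""
    seen = 0

    def callback(samples):
        nonlocal seen
        seen += 1
        return 1 if seen < 2 else 0

    return callback


chunks = [[0.0, 0.1], [0.2, 0.3], [0.4, 0.5]]
print(len(run_generation(chunks, lambda samples: 1)))  # prints 3
print(len(run_generation(chunks, stop_after_two())))   # prints 2
```

A callback that always returns 1, like the one in the Kotlin example, never interrupts generation.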

Java API

You can use the following code to play with vits-piper-no_NO-talesyntese-medium with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-no_NO-talesyntese-medium/no_NO-talesyntese-medium.onnx");
    vits.setTokens("vits-piper-no_NO-talesyntese-medium/tokens.txt");
    vits.setDataDir("vits-piper-no_NO-talesyntese-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Uskyldig kan stormen veroorzaken";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-no_NO-talesyntese-medium with Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-no_NO-talesyntese-medium/no_NO-talesyntese-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-no_NO-talesyntese-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-no_NO-talesyntese-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Uskyldig kan stormen veroorzaken', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-no_NO-talesyntese-medium with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-no_NO-talesyntese-medium/no_NO-talesyntese-medium.onnx",
				Tokens:  "vits-piper-no_NO-talesyntese-medium/tokens.txt",
				DataDir: "vits-piper-no_NO-talesyntese-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Uskyldig kan stormen veroorzaken"

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Uskyldig kan stormen veroorzaken

sample audios for different speakers are listed below:

Speaker 0

Persian

This section lists text to speech models for Persian.

vits-piper-fa_IR-amir-medium

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/fa/fa_IR/amir/medium

| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-fa_IR-amir-medium.tar.bz2
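
If you prefer to script the download, the following is a minimal Python sketch that uses only the standard library. It assumes the archive unpacks into a directory named after the model, as the paths in the examples below suggest. The helper names `archive_url`, `extract_archive`, and `download_model` are hypothetical and not part of sherpa-onnx.

```python
import tarfile
import urllib.request
from pathlib import Path

# Release URL prefix, taken from the download address above.
BASE_URL = "https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models"


def archive_url(model_name: str) -> str:
    """Return the URL of the .tar.bz2 archive for a model name."""
    return f"{BASE_URL}/{model_name}.tar.bz2"


def extract_archive(archive: Path, dest: Path) -> None:
    """Unpack a .tar.bz2 archive into dest (similar to `tar xvf`)."""
    with tarfile.open(archive, "r:bz2") as tar:
        tar.extractall(dest)


def download_model(model_name: str, dest: Path = Path(".")) -> Path:
    """Download and unpack a model, then return the extracted directory."""
    archive = dest / f"{model_name}.tar.bz2"
    urllib.request.urlretrieve(archive_url(model_name), archive)
    extract_archive(archive, dest)
    archive.unlink()  # delete the archive once it has been unpacked
    return dest / model_name
```

Calling `download_model("vits-piper-fa_IR-amir-medium")` from the directory where you run the examples should leave the model files at the relative paths used in the code below.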

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

| ABI | URL | China mirror |
|---|---|---|
| arm64-v8a | Download | Download |
| armeabi-v7a | Download | Download |
| x86_64 | Download | Download |
| x86 | Download | Download |

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-fa_IR-amir-medium with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-fa_IR-amir-medium/fa_IR-amir-medium.onnx";
  config.model.vits.tokens = "vits-piper-fa_IR-amir-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-fa_IR-amir-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

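  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
  if (!tts) {
    // Creation fails, for example, when a model path is wrong
    fprintf(stderr, "Failed to create the TTS engine. Please check your config\n");
    return -1;
  }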

  const char *text = "همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
  -DSHERPA_ONNX_ENABLE_C_API=ON \
  -DCMAKE_BUILD_TYPE=Release \
  -DBUILD_SHARED_LIBS=ON \
  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-piper \
  /tmp/test-piper.c \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-fa_IR-amir-medium.tar.bz2

You can use the following code to play with vits-piper-fa_IR-amir-medium

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-fa_IR-amir-medium/fa_IR-amir-medium.onnx",
            data_dir="vits-piper-fa_IR-amir-medium/espeak-ng-data",
            tokens="vits-piper-fa_IR-amir-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
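
If soundfile is not available, the samples can also be saved with Python's standard `wave` module. The sketch below assumes the generated samples are mono floats in the range [-1, 1]; `write_wave_pcm16` is a hypothetical helper and not part of sherpa-onnx.

```python
import struct
import wave


def write_wave_pcm16(filename, samples, sample_rate):
    """Write mono float samples in [-1, 1] as a 16-bit PCM WAV file."""
    frames = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))  # clip to the valid range
        frames += struct.pack("<h", int(s * 32767))  # little-endian int16
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)  # mono
        f.setsampwidth(2)  # 2 bytes per sample = 16 bits
        f.setframerate(sample_rate)
        f.writeframes(bytes(frames))
```

With the objects from the example above, this would be called as `write_wave_pcm16("test.wav", audio.samples, audio.sample_rate)`.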

C++ API

You can use the following code to play with vits-piper-fa_IR-amir-medium with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-fa_IR-amir-medium/fa_IR-amir-medium.onnx";
  config.model.vits.tokens = "vits-piper-fa_IR-amir-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-fa_IR-amir-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
  -DSHERPA_ONNX_ENABLE_C_API=ON \
  -DCMAKE_BUILD_TYPE=Release \
  -DBUILD_SHARED_LIBS=ON \
  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++ \
  -std=c++17 \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-piper \
  /tmp/test-piper.cc \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-cxx-api \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with vits-piper-fa_IR-amir-medium with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-fa_IR-amir-medium/fa_IR-amir-medium.onnx".into()),
                tokens: Some("vits-piper-fa_IR-amir-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-fa_IR-amir-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-fa_IR-amir-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-fa_IR-amir-medium/fa_IR-amir-medium.onnx',
        tokens: 'vits-piper-fa_IR-amir-medium/tokens.txt',
        dataDir: 'vits-piper-fa_IR-amir-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.
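
The Node.js example above reports the real-time factor (RTF), which is the elapsed generation time divided by the duration of the generated audio. A value below 1 means the audio is generated faster than it plays back. The same arithmetic is sketched below in Python; `real_time_factor` and `timed` are hypothetical helper names, and the numbers in the usage line are made up for illustration.

```python
import time


def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """RTF = processing time / duration of the generated audio."""
    duration = num_samples / sample_rate  # audio length in seconds
    return elapsed_seconds / duration


def timed(fn, *args, **kwargs):
    """Call fn and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


# Made-up numbers: 2 seconds of 22050 Hz audio generated in 0.5 seconds.
print(real_time_factor(0.5, 2 * 22050, 22050))  # prints 0.25
```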

Dart API

You can use the following code to play with vits-piper-fa_IR-amir-medium with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-fa_IR-amir-medium/fa_IR-amir-medium.onnx',
    tokens: 'vits-piper-fa_IR-amir-medium/tokens.txt',
    dataDir: 'vits-piper-fa_IR-amir-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-fa_IR-amir-medium with Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-fa_IR-amir-medium/fa_IR-amir-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-fa_IR-amir-medium/tokens.txt",
    dataDir: "vits-piper-fa_IR-amir-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-fa_IR-amir-medium with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-fa_IR-amir-medium/fa_IR-amir-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-fa_IR-amir-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-fa_IR-amir-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-fa_IR-amir-medium with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-fa_IR-amir-medium/fa_IR-amir-medium.onnx",
        tokens = "vits-piper-fa_IR-amir-medium/tokens.txt",
        dataDir = "vits-piper-fa_IR-amir-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with vits-piper-fa_IR-amir-medium with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-fa_IR-amir-medium/fa_IR-amir-medium.onnx");
    vits.setTokens("vits-piper-fa_IR-amir-medium/tokens.txt");
    vits.setDataDir("vits-piper-fa_IR-amir-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-fa_IR-amir-medium with Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-fa_IR-amir-medium/fa_IR-amir-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-fa_IR-amir-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-fa_IR-amir-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-fa_IR-amir-medium with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-fa_IR-amir-medium/fa_IR-amir-medium.onnx",
				Tokens:  "vits-piper-fa_IR-amir-medium/tokens.txt",
				DataDir: "vits-piper-fa_IR-amir-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.

sample audios for different speakers are listed below:

Speaker 0

vits-piper-fa_IR-ganji-medium

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/fa/fa_IR/ganji/medium

| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-fa_IR-ganji-medium.tar.bz2
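
The examples below refer to three paths inside the extracted directory: the `.onnx` model, `tokens.txt`, and `espeak-ng-data`. Before running them, it can help to confirm that all three exist relative to your working directory. The following is a minimal Python sketch; `missing_model_files` is a hypothetical helper and not part of sherpa-onnx.

```python
from pathlib import Path


def missing_model_files(model_dir, onnx_name):
    """Return the required paths that do not exist under model_dir."""
    root = Path(model_dir)
    required = [
        root / onnx_name,         # acoustic model
        root / "tokens.txt",      # token table
        root / "espeak-ng-data",  # data directory used for phonemization
    ]
    return [str(p) for p in required if not p.exists()]


missing = missing_model_files(
    "vits-piper-fa_IR-ganji-medium", "fa_IR-ganji-medium.onnx"
)
if missing:
    print("Missing:", ", ".join(missing))
else:
    print("All model files found")
```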

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

| ABI | URL | China mirror |
|---|---|---|
| arm64-v8a | Download | Download |
| armeabi-v7a | Download | Download |
| x86_64 | Download | Download |
| x86 | Download | Download |

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-fa_IR-ganji-medium with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-fa_IR-ganji-medium/fa_IR-ganji-medium.onnx";
  config.model.vits.tokens = "vits-piper-fa_IR-ganji-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-fa_IR-ganji-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

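  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
  if (!tts) {
    // Creation fails, for example, when a model path is wrong
    fprintf(stderr, "Failed to create the TTS engine. Please check your config\n");
    return -1;
  }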

  const char *text = "همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
  -DSHERPA_ONNX_ENABLE_C_API=ON \
  -DCMAKE_BUILD_TYPE=Release \
  -DBUILD_SHARED_LIBS=ON \
  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-piper \
  /tmp/test-piper.c \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-fa_IR-ganji-medium.tar.bz2

You can use the following code to play with vits-piper-fa_IR-ganji-medium

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-fa_IR-ganji-medium/fa_IR-ganji-medium.onnx",
            data_dir="vits-piper-fa_IR-ganji-medium/espeak-ng-data",
            tokens="vits-piper-fa_IR-ganji-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)

C++ API

You can use the following code to play with vits-piper-fa_IR-ganji-medium with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-fa_IR-ganji-medium/fa_IR-ganji-medium.onnx";
  config.model.vits.tokens = "vits-piper-fa_IR-ganji-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-fa_IR-ganji-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
  -DSHERPA_ONNX_ENABLE_C_API=ON \
  -DCMAKE_BUILD_TYPE=Release \
  -DBUILD_SHARED_LIBS=ON \
  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++ \
  -std=c++17 \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-piper \
  /tmp/test-piper.cc \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-cxx-api \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with vits-piper-fa_IR-ganji-medium with the Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-fa_IR-ganji-medium/fa_IR-ganji-medium.onnx".into()),
                tokens: Some("vits-piper-fa_IR-ganji-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-fa_IR-ganji-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-fa_IR-ganji-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-fa_IR-ganji-medium/fa_IR-ganji-medium.onnx',
        tokens: 'vits-piper-fa_IR-ganji-medium/tokens.txt',
        dataDir: 'vits-piper-fa_IR-ganji-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.
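
The Node.js example above reports a real-time factor (RTF), computed as the elapsed generation time divided by the duration of the generated audio. The following is a minimal Python sketch of the same arithmetic. The function name `real_time_factor` and the numbers in the usage note are hypothetical and are not part of sherpa-onnx.

```python
def real_time_factor(elapsed_seconds: float, num_samples: int, sample_rate: int) -> float:
    """Return elapsed time divided by audio duration (lower means faster)."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    # Duration of the generated audio in seconds.
    duration = num_samples / sample_rate
    return elapsed_seconds / duration
```

For instance, if generating 44100 samples at a hypothetical sample rate of 22050 Hz takes 0.5 seconds, the audio lasts 2 seconds and `real_time_factor(0.5, 44100, 22050)` gives 0.25. An RTF below 1 means the audio is generated faster than it plays back.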

Dart API

You can use the following code to play with vits-piper-fa_IR-ganji-medium with the Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-fa_IR-ganji-medium/fa_IR-ganji-medium.onnx',
    tokens: 'vits-piper-fa_IR-ganji-medium/tokens.txt',
    dataDir: 'vits-piper-fa_IR-ganji-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-fa_IR-ganji-medium with the Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-fa_IR-ganji-medium/fa_IR-ganji-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-fa_IR-ganji-medium/tokens.txt",
    dataDir: "vits-piper-fa_IR-ganji-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-fa_IR-ganji-medium with the C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-fa_IR-ganji-medium/fa_IR-ganji-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-fa_IR-ganji-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-fa_IR-ganji-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-fa_IR-ganji-medium with the Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-fa_IR-ganji-medium/fa_IR-ganji-medium.onnx",
        tokens = "vits-piper-fa_IR-ganji-medium/tokens.txt",
        dataDir = "vits-piper-fa_IR-ganji-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with vits-piper-fa_IR-ganji-medium with the Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-fa_IR-ganji-medium/fa_IR-ganji-medium.onnx");
    vits.setTokens("vits-piper-fa_IR-ganji-medium/tokens.txt");
    vits.setDataDir("vits-piper-fa_IR-ganji-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-fa_IR-ganji-medium with the Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-fa_IR-ganji-medium/fa_IR-ganji-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-fa_IR-ganji-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-fa_IR-ganji-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-fa_IR-ganji-medium with the Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-fa_IR-ganji-medium/fa_IR-ganji-medium.onnx",
				Tokens:  "vits-piper-fa_IR-ganji-medium/tokens.txt",
				DataDir: "vits-piper-fa_IR-ganji-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.

audio samples for different speakers are listed below:

Speaker 0

vits-piper-fa_IR-ganji_adabi-medium

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/fa/fa_IR/ganji_adabi/medium

Number of speakers | Sample rate (Hz)
1 | 22050
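
The values listed above can be checked against a file produced by any of the examples in this section. The following is a minimal sketch using Python's standard `wave` module, assuming the output is a PCM WAV file such as `test.wav`. The helper name `wav_info` is hypothetical.

```python
import wave


def wav_info(path: str) -> tuple[int, int, float]:
    """Return (sample_rate, num_channels, duration_in_seconds) of a PCM WAV file."""
    with wave.open(path, "rb") as f:
        sample_rate = f.getframerate()
        duration = f.getnframes() / sample_rate
        return sample_rate, f.getnchannels(), duration


# Usage (after running one of the examples that writes test.wav):
#   sample_rate, channels, duration = wav_info("test.wav")
#   print(sample_rate, channels, round(duration, 3))
```

If the reported sample rate differs from the one listed for the model, the file was probably resampled or produced by a different model.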

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-fa_IR-ganji_adabi-medium.tar.bz2
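
If you prefer to fetch the archive from a script, the following is a minimal sketch using only the Python standard library. The helper names `model_url` and `download_and_extract` are hypothetical, and the sketch assumes the archive unpacks into a directory named after the model, which matches the paths used in the examples below.

```python
import tarfile
import urllib.request
from pathlib import Path

BASE_URL = "https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models"


def model_url(name: str) -> str:
    """Return the release URL of the archive for the model called `name`."""
    return f"{BASE_URL}/{name}.tar.bz2"


def download_and_extract(name: str, dest: str = ".") -> Path:
    """Download <name>.tar.bz2 into `dest`, unpack it, and delete the archive."""
    dest_dir = Path(dest)
    dest_dir.mkdir(parents=True, exist_ok=True)
    archive = dest_dir / f"{name}.tar.bz2"
    urllib.request.urlretrieve(model_url(name), str(archive))
    with tarfile.open(archive, "r:bz2") as tar:
        tar.extractall(dest_dir)
    archive.unlink()
    # Assumption: the archive contains a top-level directory named `name`.
    return dest_dir / name


# Usage:
#   download_and_extract("vits-piper-fa_IR-ganji_adabi-medium")
```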

Android APK

The following table lists the Android TTS Engine APKs with this model for sherpa-onnx v1.13.2.

ABI | URL | China mirror
arm64-v8a | Download | Download
armeabi-v7a | Download | Download
x86_64 | Download | Download
x86 | Download | Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-fa_IR-ganji_adabi-medium with the C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-fa_IR-ganji_adabi-medium/fa_IR-ganji_adabi-medium.onnx";
  config.model.vits.tokens = "vits-piper-fa_IR-ganji_adabi-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-fa_IR-ganji_adabi-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc   -I /tmp/sherpa-onnx/shared/include   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-c-api   -lonnxruntime   -o /tmp/test-piper   /tmp/test-piper.c

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx and soundfile via

pip install sherpa-onnx soundfile

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-fa_IR-ganji_adabi-medium.tar.bz2

You can use the following code to play with vits-piper-fa_IR-ganji_adabi-medium with the Python API.

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-fa_IR-ganji_adabi-medium/fa_IR-ganji_adabi-medium.onnx",
            data_dir="vits-piper-fa_IR-ganji_adabi-medium/espeak-ng-data",
            tokens="vits-piper-fa_IR-ganji_adabi-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
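
The config above points at three paths inside the extracted model directory. Before creating the TTS object, it can help to confirm that they exist. The following is a minimal sketch with a hypothetical helper `missing_model_files`. It uses only the standard library and does not call sherpa-onnx.

```python
from pathlib import Path


def missing_model_files(model_dir: str, voice: str) -> list[str]:
    """Return the expected model paths under `model_dir` that do not exist."""
    root = Path(model_dir)
    expected = [
        root / f"{voice}.onnx",   # acoustic model
        root / "tokens.txt",      # token table
        root / "espeak-ng-data",  # espeak-ng data directory
    ]
    return [str(p) for p in expected if not p.exists()]


# Usage:
#   missing = missing_model_files(
#       "vits-piper-fa_IR-ganji_adabi-medium", "fa_IR-ganji_adabi-medium")
#   if missing:
#       raise FileNotFoundError(f"Missing model files: {missing}")
```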

C++ API

You can use the following code to play with vits-piper-fa_IR-ganji_adabi-medium with the C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-fa_IR-ganji_adabi-medium/fa_IR-ganji_adabi-medium.onnx";
  config.model.vits.tokens = "vits-piper-fa_IR-ganji_adabi-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-fa_IR-ganji_adabi-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++   -std=c++17   -I /tmp/sherpa-onnx/shared/include   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-cxx-api   -lsherpa-onnx-c-api   -lonnxruntime   -o /tmp/test-piper   /tmp/test-piper.cc

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with vits-piper-fa_IR-ganji_adabi-medium with the Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-fa_IR-ganji_adabi-medium/fa_IR-ganji_adabi-medium.onnx".into()),
                tokens: Some("vits-piper-fa_IR-ganji_adabi-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-fa_IR-ganji_adabi-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-fa_IR-ganji_adabi-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-fa_IR-ganji_adabi-medium/fa_IR-ganji_adabi-medium.onnx',
        tokens: 'vits-piper-fa_IR-ganji_adabi-medium/tokens.txt',
        dataDir: 'vits-piper-fa_IR-ganji_adabi-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to play with vits-piper-fa_IR-ganji_adabi-medium with the Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-fa_IR-ganji_adabi-medium/fa_IR-ganji_adabi-medium.onnx',
    tokens: 'vits-piper-fa_IR-ganji_adabi-medium/tokens.txt',
    dataDir: 'vits-piper-fa_IR-ganji_adabi-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-fa_IR-ganji_adabi-medium with the Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-fa_IR-ganji_adabi-medium/fa_IR-ganji_adabi-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-fa_IR-ganji_adabi-medium/tokens.txt",
    dataDir: "vits-piper-fa_IR-ganji_adabi-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-fa_IR-ganji_adabi-medium with the C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-fa_IR-ganji_adabi-medium/fa_IR-ganji_adabi-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-fa_IR-ganji_adabi-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-fa_IR-ganji_adabi-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-fa_IR-ganji_adabi-medium with the Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-fa_IR-ganji_adabi-medium/fa_IR-ganji_adabi-medium.onnx",
        tokens = "vits-piper-fa_IR-ganji_adabi-medium/tokens.txt",
        dataDir = "vits-piper-fa_IR-ganji_adabi-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.
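
The callback in the example above always returns 1, so generation runs to completion. Returning 0 stops it early. The following is a minimal Python sketch of a callback that requests a stop once a sample budget is reached. The helper `make_limit_callback` is hypothetical, and whether a given language binding accepts a callable of exactly this shape is an assumption that this page does not confirm.

```python
from typing import Callable, Sequence


def make_limit_callback(max_samples: int) -> Callable[[Sequence[float], float], int]:
    """Build a callback that returns 1 until `max_samples` samples were seen, then 0."""
    seen = 0

    def callback(samples: Sequence[float], progress: float = 0.0) -> int:
        nonlocal seen
        # Count the samples delivered in this chunk.
        seen += len(samples)
        # 1 means continue generating, 0 means stop.
        return 1 if seen < max_samples else 0

    return callback
```

Keeping the counter inside a closure means each generation call can get its own fresh callback without relying on global state.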

Java API

You can use the following code to play with vits-piper-fa_IR-ganji_adabi-medium with the Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-fa_IR-ganji_adabi-medium/fa_IR-ganji_adabi-medium.onnx");
    vits.setTokens("vits-piper-fa_IR-ganji_adabi-medium/tokens.txt");
    vits.setDataDir("vits-piper-fa_IR-ganji_adabi-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-fa_IR-ganji_adabi-medium with the Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-fa_IR-ganji_adabi-medium/fa_IR-ganji_adabi-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-fa_IR-ganji_adabi-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-fa_IR-ganji_adabi-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-fa_IR-ganji_adabi-medium with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-fa_IR-ganji_adabi-medium/fa_IR-ganji_adabi-medium.onnx",
				Tokens:  "vits-piper-fa_IR-ganji_adabi-medium/tokens.txt",
				DataDir: "vits-piper-fa_IR-ganji_adabi-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.

sample audios for different speakers are listed below:

Speaker 0

vits-piper-fa_IR-gyro-medium

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/fa/fa_IR/gyro/medium

Number of speakers | Sample rate
1 | 22050

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-fa_IR-gyro-medium.tar.bz2
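
The archive name, the directory it unpacks to, and the file names used by the API examples in this section appear to follow one pattern. The sketch below derives them from the model name and wraps download and extraction in a helper. The naming rule is inferred from the examples on this page, and `archive_url`, `model_files`, and `download_and_extract` are hypothetical helper names, not part of sherpa-onnx.

```python
import tarfile
import urllib.request
from pathlib import Path

# Release URL prefix taken from the download address above.
BASE_URL = "https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models"
PREFIX = "vits-piper-"


def archive_url(model_name: str) -> str:
    """Return the download URL of the .tar.bz2 archive for a model."""
    return f"{BASE_URL}/{model_name}.tar.bz2"


def model_files(model_name: str) -> dict:
    """Return the paths that the API examples pass to the VITS config.

    Assumption: the .onnx file is named after the model directory with the
    "vits-piper-" prefix removed, as in the examples on this page.
    """
    voice = model_name[len(PREFIX):] if model_name.startswith(PREFIX) else model_name
    return {
        "model": f"{model_name}/{voice}.onnx",
        "tokens": f"{model_name}/tokens.txt",
        "data_dir": f"{model_name}/espeak-ng-data",
    }


def download_and_extract(model_name: str, dest: str = ".") -> Path:
    """Download the archive into dest, unpack it, and return the model directory."""
    dest_dir = Path(dest)
    archive = dest_dir / f"{model_name}.tar.bz2"
    urllib.request.urlretrieve(archive_url(model_name), archive)
    with tarfile.open(archive, "r:bz2") as tar:
        tar.extractall(dest_dir)
    archive.unlink()  # the archive is no longer needed once unpacked
    return dest_dir / model_name
```

Calling `download_and_extract("vits-piper-fa_IR-gyro-medium", "/tmp")` would place the model under /tmp, which is where the compile-and-run instructions in this section expect it.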

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI | URL | China mirror
arm64-v8a | Download | Download
armeabi-v7a | Download | Download
x86_64 | Download | Download
x86 | Download | Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-fa_IR-gyro-medium with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-fa_IR-gyro-medium/fa_IR-gyro-medium.onnx";
  config.model.vits.tokens = "vits-piper-fa_IR-gyro-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-fa_IR-gyro-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
  -DSHERPA_ONNX_ENABLE_C_API=ON \
  -DCMAKE_BUILD_TYPE=Release \
  -DBUILD_SHARED_LIBS=ON \
  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-piper \
  /tmp/test-piper.c \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx and soundfile (the example below imports both) via

pip install sherpa-onnx soundfile

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-fa_IR-gyro-medium.tar.bz2

You can use the following code to play with vits-piper-fa_IR-gyro-medium with Python API.

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-fa_IR-gyro-medium/fa_IR-gyro-medium.onnx",
            data_dir="vits-piper-fa_IR-gyro-medium/espeak-ng-data",
            tokens="vits-piper-fa_IR-gyro-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)

C++ API

You can use the following code to play with vits-piper-fa_IR-gyro-medium with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-fa_IR-gyro-medium/fa_IR-gyro-medium.onnx";
  config.model.vits.tokens = "vits-piper-fa_IR-gyro-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-fa_IR-gyro-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
  -DSHERPA_ONNX_ENABLE_C_API=ON \
  -DCMAKE_BUILD_TYPE=Release \
  -DBUILD_SHARED_LIBS=ON \
  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++ \
  -std=c++17 \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-piper \
  /tmp/test-piper.cc \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-cxx-api \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html
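
The progress callback in the C++ example above returns 1 to continue generating and 0 to stop. The following Python sketch illustrates that protocol in isolation, without calling sherpa-onnx: `make_limited_callback` is a hypothetical helper that asks the generator to stop once a sample budget has been reached. The `(samples, progress)` signature is an assumption modeled on the C++ callback, and the loop only simulates a generator delivering audio in chunks.

```python
from typing import Callable, List, Sequence


def make_limited_callback(
    max_samples: int, sink: List[float]
) -> Callable[[Sequence[float], float], int]:
    """Build a callback that collects samples and stops after max_samples."""

    def callback(samples: Sequence[float], progress: float) -> int:
        sink.extend(samples)
        # 1 asks the generator to continue, 0 asks it to stop.
        return 1 if len(sink) < max_samples else 0

    return callback


# Simulate a generator that delivers audio in three chunks.
collected: List[float] = []
cb = make_limited_callback(max_samples=5, sink=collected)
chunks = [[0.0, 0.1], [0.2, 0.3], [0.4, 0.5]]
results = []
for i, chunk in enumerate(chunks):
    keep_going = cb(chunk, (i + 1) / len(chunks))
    results.append(keep_going)
    if keep_going == 0:
        break
```

Returning 0 early is useful when the caller only needs a preview of the audio or when the user cancels playback.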

Rust API

You can use the following code to play with vits-piper-fa_IR-gyro-medium with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-fa_IR-gyro-medium/fa_IR-gyro-medium.onnx".into()),
                tokens: Some("vits-piper-fa_IR-gyro-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-fa_IR-gyro-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-fa_IR-gyro-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-fa_IR-gyro-medium/fa_IR-gyro-medium.onnx',
        tokens: 'vits-piper-fa_IR-gyro-medium/tokens.txt',
        dataDir: 'vits-piper-fa_IR-gyro-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.
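
The Node.js example above reports a real-time factor (RTF): the time spent generating divided by the duration of the generated audio, where the duration is the number of samples divided by the sample rate. An RTF below 1 means synthesis runs faster than playback. A minimal Python sketch of the same arithmetic follows; the input numbers are made up for illustration and are not measurements of this model.

```python
def audio_duration(num_samples: int, sample_rate: int) -> float:
    """Duration of the audio in seconds."""
    return num_samples / sample_rate


def real_time_factor(elapsed_seconds: float, num_samples: int, sample_rate: int) -> float:
    """Elapsed generation time divided by the audio duration."""
    duration = audio_duration(num_samples, sample_rate)
    if duration <= 0:
        raise ValueError("audio must contain at least one sample")
    return elapsed_seconds / duration


# Illustrative numbers only: 10 s of audio at 22050 Hz generated in 2.5 s.
rtf = real_time_factor(2.5, 220500, 22050)
print(f"RTF = {rtf:.3f}")
```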

Dart API

You can use the following code to play with vits-piper-fa_IR-gyro-medium with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-fa_IR-gyro-medium/fa_IR-gyro-medium.onnx',
    tokens: 'vits-piper-fa_IR-gyro-medium/tokens.txt',
    dataDir: 'vits-piper-fa_IR-gyro-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-fa_IR-gyro-medium with Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-fa_IR-gyro-medium/fa_IR-gyro-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-fa_IR-gyro-medium/tokens.txt",
    dataDir: "vits-piper-fa_IR-gyro-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-fa_IR-gyro-medium with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-fa_IR-gyro-medium/fa_IR-gyro-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-fa_IR-gyro-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-fa_IR-gyro-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-fa_IR-gyro-medium with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-fa_IR-gyro-medium/fa_IR-gyro-medium.onnx",
        tokens = "vits-piper-fa_IR-gyro-medium/tokens.txt",
        dataDir = "vits-piper-fa_IR-gyro-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with vits-piper-fa_IR-gyro-medium with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-fa_IR-gyro-medium/fa_IR-gyro-medium.onnx");
    vits.setTokens("vits-piper-fa_IR-gyro-medium/tokens.txt");
    vits.setDataDir("vits-piper-fa_IR-gyro-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-fa_IR-gyro-medium with Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-fa_IR-gyro-medium/fa_IR-gyro-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-fa_IR-gyro-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-fa_IR-gyro-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-fa_IR-gyro-medium with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-fa_IR-gyro-medium/fa_IR-gyro-medium.onnx",
				Tokens:  "vits-piper-fa_IR-gyro-medium/tokens.txt",
				DataDir: "vits-piper-fa_IR-gyro-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.

sample audios for different speakers are listed below:

Speaker 0

vits-piper-fa_IR-reza_ibrahim-medium

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/fa/fa_IR/reza_ibrahim/medium

Number of speakers | Sample rate
1 | 22050

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-fa_IR-reza_ibrahim-medium.tar.bz2

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI | URL | China mirror
arm64-v8a | Download | Download
armeabi-v7a | Download | Download
x86_64 | Download | Download
x86 | Download | Download

If you don’t know what ABI is, you probably need to select arm64-v8a.
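
If you want to pick the ABI programmatically, the sketch below maps common CPU architecture strings (as reported by `uname -m`, for example) to the four ABI names in the table. The mapping and the helper name `android_abi` are illustrative assumptions, not part of sherpa-onnx.

```python
# Common machine strings and the Android ABI they correspond to.
_ABI_BY_MACHINE = {
    "aarch64": "arm64-v8a",
    "arm64": "arm64-v8a",
    "armv7l": "armeabi-v7a",
    "armv7": "armeabi-v7a",
    "x86_64": "x86_64",
    "amd64": "x86_64",
    "i686": "x86",
    "i386": "x86",
}


def android_abi(machine: str) -> str:
    """Map a CPU architecture string to an Android ABI name."""
    key = machine.strip().lower()
    if key not in _ABI_BY_MACHINE:
        raise ValueError(f"Unknown CPU architecture: {machine!r}")
    return _ABI_BY_MACHINE[key]
```

On a device connected over adb, `adb shell getprop ro.product.cpu.abi` prints the ABI directly, so the mapping is only needed when all you have is the architecture string.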

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-fa_IR-reza_ibrahim-medium with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-fa_IR-reza_ibrahim-medium/fa_IR-reza_ibrahim-medium.onnx";
  config.model.vits.tokens = "vits-piper-fa_IR-reza_ibrahim-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-fa_IR-reza_ibrahim-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
  -DSHERPA_ONNX_ENABLE_C_API=ON \
  -DCMAKE_BUILD_TYPE=Release \
  -DBUILD_SHARED_LIBS=ON \
  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-piper \
  /tmp/test-piper.c \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.
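
When the compiled binary is launched from a script, the same library search path can be set for that one process instead of being exported in the shell. A minimal Python sketch, assuming the install prefix /tmp/sherpa-onnx/shared used above; `library_env` and `run_test_piper` are hypothetical helpers.

```python
import os
import platform
import subprocess
from typing import Dict, Optional


def library_env(
    lib_dir: str,
    system: Optional[str] = None,
    base: Optional[Dict[str, str]] = None,
) -> Dict[str, str]:
    """Return a copy of the environment with lib_dir prepended to the loader path."""
    system = system or platform.system()
    env = dict(os.environ if base is None else base)
    # macOS uses DYLD_LIBRARY_PATH; Linux uses LD_LIBRARY_PATH.
    var = "DYLD_LIBRARY_PATH" if system == "Darwin" else "LD_LIBRARY_PATH"
    current = env.get(var, "")
    env[var] = f"{lib_dir}:{current}" if current else lib_dir
    return env


def run_test_piper() -> int:
    """Run /tmp/test-piper from /tmp so that the relative model paths resolve."""
    env = library_env("/tmp/sherpa-onnx/shared/lib")
    return subprocess.run(["/tmp/test-piper"], cwd="/tmp", env=env).returncode
```

Setting the variable per process keeps the change from leaking into other programs started from the same shell.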

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx and soundfile (the example below imports both) via

pip install sherpa-onnx soundfile

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-fa_IR-reza_ibrahim-medium.tar.bz2

You can use the following code to play with vits-piper-fa_IR-reza_ibrahim-medium with Python API.

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-fa_IR-reza_ibrahim-medium/fa_IR-reza_ibrahim-medium.onnx",
            data_dir="vits-piper-fa_IR-reza_ibrahim-medium/espeak-ng-data",
            tokens="vits-piper-fa_IR-reza_ibrahim-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
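
If soundfile is not available, the samples can also be written with Python's standard `wave` module. The sketch below assumes the samples are floats in the range [-1, 1] and stores them as 16-bit mono PCM; `write_wav_int16` is a hypothetical helper, not part of sherpa-onnx.

```python
import struct
import wave
from typing import Iterable


def write_wav_int16(filename: str, samples: Iterable[float], sample_rate: int) -> int:
    """Write float samples in [-1, 1] as 16-bit mono PCM and return the sample count."""
    pcm = []
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))  # clip to avoid integer overflow
        pcm.append(int(round(s * 32767)))
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)  # 2 bytes = 16 bits per sample
        f.setframerate(sample_rate)
        f.writeframes(struct.pack(f"<{len(pcm)}h", *pcm))
    return len(pcm)
```

With the `audio` object from the example above, the call would be `write_wav_int16("test.wav", audio.samples, audio.sample_rate)`.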

C++ API

You can use the following code to play with vits-piper-fa_IR-reza_ibrahim-medium with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-fa_IR-reza_ibrahim-medium/fa_IR-reza_ibrahim-medium.onnx";
  config.model.vits.tokens = "vits-piper-fa_IR-reza_ibrahim-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-fa_IR-reza_ibrahim-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
  -DSHERPA_ONNX_ENABLE_C_API=ON \
  -DCMAKE_BUILD_TYPE=Release \
  -DBUILD_SHARED_LIBS=ON \
  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++ \
  -std=c++17 \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-piper \
  /tmp/test-piper.cc \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-cxx-api \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with vits-piper-fa_IR-reza_ibrahim-medium with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-fa_IR-reza_ibrahim-medium/fa_IR-reza_ibrahim-medium.onnx".into()),
                tokens: Some("vits-piper-fa_IR-reza_ibrahim-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-fa_IR-reza_ibrahim-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-fa_IR-reza_ibrahim-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-fa_IR-reza_ibrahim-medium/fa_IR-reza_ibrahim-medium.onnx',
        tokens: 'vits-piper-fa_IR-reza_ibrahim-medium/tokens.txt',
        dataDir: 'vits-piper-fa_IR-reza_ibrahim-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.
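
The real-time factor (RTF) printed by the example above is the elapsed generation time divided by the duration of the generated audio, so a value below 1 means synthesis ran faster than playback. A minimal sketch of the same arithmetic, with a hypothetical helper name and made-up numbers:

```python
def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """Return elapsed time divided by audio duration (both in seconds)."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # seconds of generated audio
    return elapsed_seconds / duration


# Hypothetical numbers: 3 seconds of audio at 22050 Hz generated in 0.6 seconds.
rtf = real_time_factor(0.6, 3 * 22050, 22050)
print(f"RTF = {rtf:.3f}")
```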

Dart API

You can use the following code to play with vits-piper-fa_IR-reza_ibrahim-medium with the Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-fa_IR-reza_ibrahim-medium/fa_IR-reza_ibrahim-medium.onnx',
    tokens: 'vits-piper-fa_IR-reza_ibrahim-medium/tokens.txt',
    dataDir: 'vits-piper-fa_IR-reza_ibrahim-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-fa_IR-reza_ibrahim-medium with the Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-fa_IR-reza_ibrahim-medium/fa_IR-reza_ibrahim-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-fa_IR-reza_ibrahim-medium/tokens.txt",
    dataDir: "vits-piper-fa_IR-reza_ibrahim-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-fa_IR-reza_ibrahim-medium with the C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-fa_IR-reza_ibrahim-medium/fa_IR-reza_ibrahim-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-fa_IR-reza_ibrahim-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-fa_IR-reza_ibrahim-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-fa_IR-reza_ibrahim-medium with the Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-fa_IR-reza_ibrahim-medium/fa_IR-reza_ibrahim-medium.onnx",
        tokens = "vits-piper-fa_IR-reza_ibrahim-medium/tokens.txt",
        dataDir = "vits-piper-fa_IR-reza_ibrahim-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with vits-piper-fa_IR-reza_ibrahim-medium with the Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-fa_IR-reza_ibrahim-medium/fa_IR-reza_ibrahim-medium.onnx");
    vits.setTokens("vits-piper-fa_IR-reza_ibrahim-medium/tokens.txt");
    vits.setDataDir("vits-piper-fa_IR-reza_ibrahim-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-fa_IR-reza_ibrahim-medium with the Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-fa_IR-reza_ibrahim-medium/fa_IR-reza_ibrahim-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-fa_IR-reza_ibrahim-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-fa_IR-reza_ibrahim-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-fa_IR-reza_ibrahim-medium with the Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-fa_IR-reza_ibrahim-medium/fa_IR-reza_ibrahim-medium.onnx",
				Tokens:  "vits-piper-fa_IR-reza_ibrahim-medium/tokens.txt",
				DataDir: "vits-piper-fa_IR-reza_ibrahim-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.

audio samples for each speaker are listed below:

Speaker 0

Polish

This section lists text to speech models for Polish.

vits-piper-pl_PL-bass-high

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/pl/pl_PL/bass/high

Number of speakers    Sample rate
1                     22050

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-pl_PL-bass-high.tar.bz2
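
After downloading the archive (for example with wget), it needs to be unpacked so that the model directory sits next to your program. The following is a minimal sketch using only the Python standard library; the helper name `extract_model` is hypothetical, and the path check is a defensive assumption rather than something the archive is known to require.

```python
import tarfile
from pathlib import Path


def extract_model(archive, dest="."):
    """Extract a .tar.bz2 model archive into dest and return the member names."""
    with tarfile.open(archive, "r:bz2") as tar:
        names = tar.getnames()
        # Refuse members that would land outside dest (absolute paths or "..").
        for name in names:
            if Path(name).is_absolute() or ".." in Path(name).parts:
                raise ValueError(f"unsafe path in archive: {name}")
        tar.extractall(dest)
    return names
```

Calling `extract_model("vits-piper-pl_PL-bass-high.tar.bz2")` in the download directory should leave a `vits-piper-pl_PL-bass-high/` directory containing the files referenced by the examples below.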

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI            URL         China mirror
arm64-v8a      Download    Download
armeabi-v7a    Download    Download
x86_64         Download    Download
x86            Download    Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-pl_PL-bass-high with the C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-pl_PL-bass-high/pl_PL-bass-high.onnx";
  config.model.vits.tokens = "vits-piper-pl_PL-bass-high/tokens.txt";
  config.model.vits.data_dir = "vits-piper-pl_PL-bass-high/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Nieważne, za kogo walczysz, i tak popełnisz błąd";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
  -DSHERPA_ONNX_ENABLE_C_API=ON \
  -DCMAKE_BUILD_TYPE=Release \
  -DBUILD_SHARED_LIBS=ON \
  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
  ..

make
make install

You can find the required header files and libraries inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-piper \
  /tmp/test-piper.c \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-pl_PL-bass-high.tar.bz2

You can use the following code to play with vits-piper-pl_PL-bass-high with the Python API.

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-pl_PL-bass-high/pl_PL-bass-high.onnx",
            data_dir="vits-piper-pl_PL-bass-high/espeak-ng-data",
            tokens="vits-piper-pl_PL-bass-high/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Nieważne, za kogo walczysz, i tak popełnisz błąd",
                     sid=0,
                     speed=1.0)

# Save as WAV, the same format the other examples in this section write
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
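
The Python example above relies on the third-party soundfile package to save the result. If you prefer to avoid that dependency, the following minimal sketch writes 16-bit PCM with the standard library only. It assumes the samples are mono floating-point values in [-1, 1]; the helper name `write_wave_pcm16` is hypothetical and is not part of sherpa-onnx.

```python
import struct
import wave


def write_wave_pcm16(filename, samples, sample_rate):
    """Write mono float samples in [-1, 1] as a 16-bit PCM WAV file."""
    pcm = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))  # clip so the int16 range is not exceeded
        pcm += struct.pack("<h", int(s * 32767))  # little-endian signed 16-bit
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)   # mono
        f.setsampwidth(2)   # 2 bytes = 16 bits per sample
        f.setframerate(sample_rate)
        f.writeframes(bytes(pcm))
```

With the objects from the example above, a call would look like `write_wave_pcm16("test.wav", audio.samples, audio.sample_rate)`.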

C++ API

You can use the following code to play with vits-piper-pl_PL-bass-high with the C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-pl_PL-bass-high/pl_PL-bass-high.onnx";
  config.model.vits.tokens = "vits-piper-pl_PL-bass-high/tokens.txt";
  config.model.vits.data_dir = "vits-piper-pl_PL-bass-high/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Nieważne, za kogo walczysz, i tak popełnisz błąd";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
  -DSHERPA_ONNX_ENABLE_C_API=ON \
  -DCMAKE_BUILD_TYPE=Release \
  -DBUILD_SHARED_LIBS=ON \
  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
  ..

make
make install

You can find the required header files and libraries inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++ \
  -std=c++17 \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-piper \
  /tmp/test-piper.cc \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-cxx-api \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html
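
The progress callback in the C++ example returns 1 to continue generating and 0 to stop. The toy model below illustrates that contract only. It is not the library's implementation: `generate_in_chunks`, the chunk list, and the way progress is computed are all hypothetical stand-ins.

```python
def generate_in_chunks(chunks, callback):
    """Pass each chunk to callback and stop early when it returns 0."""
    delivered = []
    total = len(chunks)
    for i, chunk in enumerate(chunks, start=1):
        delivered.extend(chunk)
        # Progress is reported as a fraction in (0, 1], as in the C++ example.
        if callback(chunk, i / total) == 0:
            break
    return delivered


def stop_after_half(samples, progress):
    print(f"Progress: {progress * 100:.1f}%")
    return 1 if progress < 0.5 else 0  # 1 = continue, 0 = stop


audio = generate_in_chunks([[0.1], [0.2], [0.3], [0.4]], stop_after_half)
```

In this sketch the chunk that triggers the stop is still kept; whether the real API keeps or discards it is not stated here and should be checked against the C API documentation.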

Rust API

You can use the following code to play with vits-piper-pl_PL-bass-high with the Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-pl_PL-bass-high/pl_PL-bass-high.onnx".into()),
                tokens: Some("vits-piper-pl_PL-bass-high/tokens.txt".into()),
                data_dir: Some("vits-piper-pl_PL-bass-high/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Nieważne, za kogo walczysz, i tak popełnisz błąd";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-pl_PL-bass-high with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-pl_PL-bass-high/pl_PL-bass-high.onnx',
        tokens: 'vits-piper-pl_PL-bass-high/tokens.txt',
        dataDir: 'vits-piper-pl_PL-bass-high/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Nieważne, za kogo walczysz, i tak popełnisz błąd';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to play with vits-piper-pl_PL-bass-high with the Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-pl_PL-bass-high/pl_PL-bass-high.onnx',
    tokens: 'vits-piper-pl_PL-bass-high/tokens.txt',
    dataDir: 'vits-piper-pl_PL-bass-high/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Nieważne, za kogo walczysz, i tak popełnisz błąd', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-pl_PL-bass-high with the Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-pl_PL-bass-high/pl_PL-bass-high.onnx",
    lexicon: "",
    tokens: "vits-piper-pl_PL-bass-high/tokens.txt",
    dataDir: "vits-piper-pl_PL-bass-high/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Nieważne, za kogo walczysz, i tak popełnisz błąd"
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-pl_PL-bass-high with the C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-pl_PL-bass-high/pl_PL-bass-high.onnx";
config.Model.Vits.Tokens = "vits-piper-pl_PL-bass-high/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-pl_PL-bass-high/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Nieważne, za kogo walczysz, i tak popełnisz błąd";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-pl_PL-bass-high with the Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-pl_PL-bass-high/pl_PL-bass-high.onnx",
        tokens = "vits-piper-pl_PL-bass-high/tokens.txt",
        dataDir = "vits-piper-pl_PL-bass-high/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Nieważne, za kogo walczysz, i tak popełnisz błąd",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with vits-piper-pl_PL-bass-high with the Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-pl_PL-bass-high/pl_PL-bass-high.onnx");
    vits.setTokens("vits-piper-pl_PL-bass-high/tokens.txt");
    vits.setDataDir("vits-piper-pl_PL-bass-high/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Nieważne, za kogo walczysz, i tak popełnisz błąd";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-pl_PL-bass-high with the Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-pl_PL-bass-high/pl_PL-bass-high.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-pl_PL-bass-high/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-pl_PL-bass-high/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Nieważne, za kogo walczysz, i tak popełnisz błąd', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-pl_PL-bass-high with the Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-pl_PL-bass-high/pl_PL-bass-high.onnx",
				Tokens:  "vits-piper-pl_PL-bass-high/tokens.txt",
				DataDir: "vits-piper-pl_PL-bass-high/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Nieważne, za kogo walczysz, i tak popełnisz błąd"

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Nieważne, za kogo walczysz, i tak popełnisz błąd

audio samples for each speaker are listed below:

Speaker 0

vits-piper-pl_PL-darkman-medium

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/pl/pl_PL/darkman/medium

Number of speakers    Sample rate
1                     22050

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-pl_PL-darkman-medium.tar.bz2

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI            URL         China mirror
arm64-v8a      Download    Download
armeabi-v7a    Download    Download
x86_64         Download    Download
x86            Download    Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-pl_PL-darkman-medium with the C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-pl_PL-darkman-medium/pl_PL-darkman-medium.onnx";
  config.model.vits.tokens = "vits-piper-pl_PL-darkman-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-pl_PL-darkman-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Nieważne, za kogo walczysz, i tak popełnisz błąd";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc   -I /tmp/sherpa-onnx/shared/include   -L /tmp/sherpa-onnx/shared/lib   -o /tmp/test-piper   /tmp/test-piper.c   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-pl_PL-darkman-medium.tar.bz2

You can use the following code to play with vits-piper-pl_PL-darkman-medium

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-pl_PL-darkman-medium/pl_PL-darkman-medium.onnx",
            data_dir="vits-piper-pl_PL-darkman-medium/espeak-ng-data",
            tokens="vits-piper-pl_PL-darkman-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Nieważne, za kogo walczysz, i tak popełnisz błąd",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
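If `soundfile` is not available, the generated samples can also be written with the standard library. The following is a minimal sketch, assuming `audio.samples` holds mono floating-point samples in the range [-1, 1]; the helper names `write_wav_int16` and `wav_duration_seconds` are hypothetical and not part of sherpa-onnx.

```python
import struct
import wave


def write_wav_int16(filename, samples, sample_rate):
    """Write mono float samples (assumed to lie in [-1, 1]) as 16-bit PCM."""
    pcm = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))  # clip to the valid range
        pcm += struct.pack("<h", int(round(s * 32767)))
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)  # mono
        f.setsampwidth(2)  # 2 bytes = 16 bits per sample
        f.setframerate(sample_rate)
        f.writeframes(bytes(pcm))


def wav_duration_seconds(filename):
    """Return the duration of a WAV file in seconds."""
    with wave.open(filename, "rb") as f:
        return f.getnframes() / f.getframerate()
```

With the `audio` object from the example above, this would be called as `write_wav_int16("test.wav", audio.samples, audio.sample_rate)`, and `wav_duration_seconds("test.wav")` gives a quick sanity check on the result.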

C++ API

You can use the following code to try vits-piper-pl_PL-darkman-medium with the C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-pl_PL-darkman-medium/pl_PL-darkman-medium.onnx";
  config.model.vits.tokens = "vits-piper-pl_PL-darkman-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-pl_PL-darkman-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Nieważne, za kogo walczysz, i tak popełnisz błąd";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++   -std=c++17   -I /tmp/sherpa-onnx/shared/include   -L /tmp/sherpa-onnx/shared/lib   -o /tmp/test-piper   /tmp/test-piper.cc   -lsherpa-onnx-cxx-api   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to try vits-piper-pl_PL-darkman-medium with the Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-pl_PL-darkman-medium/pl_PL-darkman-medium.onnx".into()),
                tokens: Some("vits-piper-pl_PL-darkman-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-pl_PL-darkman-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Nieważne, za kogo walczysz, i tak popełnisz błąd";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-pl_PL-darkman-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-pl_PL-darkman-medium/pl_PL-darkman-medium.onnx',
        tokens: 'vits-piper-pl_PL-darkman-medium/tokens.txt',
        dataDir: 'vits-piper-pl_PL-darkman-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Nieważne, za kogo walczysz, i tak popełnisz błąd';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.
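The Node.js example above reports a real-time factor (RTF), the ratio of processing time to the duration of the generated audio. The same arithmetic can be written as a small Python sketch; the helper names `real_time_factor` and `timed` are illustrative and not part of any sherpa-onnx API.

```python
import time


def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """RTF = processing time / audio duration; below 1 is faster than real time."""
    duration = num_samples / sample_rate
    return elapsed_seconds / duration


def timed(fn, *args, **kwargs):
    """Call fn and return (result, elapsed seconds) using a monotonic clock."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start
```

For example, generating 22050 samples at 22050 Hz (one second of audio) in 0.5 seconds gives an RTF of 0.5.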

Dart API

You can use the following code to try vits-piper-pl_PL-darkman-medium with the Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-pl_PL-darkman-medium/pl_PL-darkman-medium.onnx',
    tokens: 'vits-piper-pl_PL-darkman-medium/tokens.txt',
    dataDir: 'vits-piper-pl_PL-darkman-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Nieważne, za kogo walczysz, i tak popełnisz błąd', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to try vits-piper-pl_PL-darkman-medium with the Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-pl_PL-darkman-medium/pl_PL-darkman-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-pl_PL-darkman-medium/tokens.txt",
    dataDir: "vits-piper-pl_PL-darkman-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Nieważne, za kogo walczysz, i tak popełnisz błąd"
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to try vits-piper-pl_PL-darkman-medium with the C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-pl_PL-darkman-medium/pl_PL-darkman-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-pl_PL-darkman-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-pl_PL-darkman-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Nieważne, za kogo walczysz, i tak popełnisz błąd";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to try vits-piper-pl_PL-darkman-medium with the Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-pl_PL-darkman-medium/pl_PL-darkman-medium.onnx",
        tokens = "vits-piper-pl_PL-darkman-medium/tokens.txt",
        dataDir = "vits-piper-pl_PL-darkman-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Nieważne, za kogo walczysz, i tak popełnisz błąd",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to try vits-piper-pl_PL-darkman-medium with the Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-pl_PL-darkman-medium/pl_PL-darkman-medium.onnx");
    vits.setTokens("vits-piper-pl_PL-darkman-medium/tokens.txt");
    vits.setDataDir("vits-piper-pl_PL-darkman-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Nieważne, za kogo walczysz, i tak popełnisz błąd";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to try vits-piper-pl_PL-darkman-medium with the Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-pl_PL-darkman-medium/pl_PL-darkman-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-pl_PL-darkman-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-pl_PL-darkman-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Nieważne, za kogo walczysz, i tak popełnisz błąd', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to try vits-piper-pl_PL-darkman-medium with the Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-pl_PL-darkman-medium/pl_PL-darkman-medium.onnx",
				Tokens:  "vits-piper-pl_PL-darkman-medium/tokens.txt",
				DataDir: "vits-piper-pl_PL-darkman-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Nieważne, za kogo walczysz, i tak popełnisz błąd"

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Nieważne, za kogo walczysz, i tak popełnisz błąd

audio samples for each speaker are listed below:

Speaker 0

vits-piper-pl_PL-gosia-medium

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/pl/pl_PL/gosia/medium

Number of speakers | Sample rate
1 | 22050

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-pl_PL-gosia-medium.tar.bz2
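The examples in this section all refer to the same three paths inside the extracted directory. As a hedged sketch, they can be derived from the directory name, assuming the naming pattern seen here (`vits-piper-<voice>/<voice>.onnx` plus `tokens.txt` and `espeak-ng-data`) holds; the helpers `piper_model_paths` and `missing_files` are hypothetical and not part of sherpa-onnx.

```python
from pathlib import Path


def piper_model_paths(model_dir):
    """Build the three paths used by the examples from the extracted directory.

    Assumes the directory is named "vits-piper-<voice>" and contains
    "<voice>.onnx", "tokens.txt" and "espeak-ng-data".
    """
    model_dir = Path(model_dir)
    prefix = "vits-piper-"
    if not model_dir.name.startswith(prefix):
        raise ValueError(f"unexpected directory name: {model_dir.name}")
    voice = model_dir.name[len(prefix):]
    return {
        "model": (model_dir / f"{voice}.onnx").as_posix(),
        "tokens": (model_dir / "tokens.txt").as_posix(),
        "data_dir": (model_dir / "espeak-ng-data").as_posix(),
    }


def missing_files(paths):
    """Return the keys whose path does not exist on disk."""
    return [key for key, p in paths.items() if not Path(p).exists()]
```

Running `missing_files(piper_model_paths("vits-piper-pl_PL-gosia-medium"))` before creating the TTS object is a cheap way to catch an archive that was extracted to a different directory.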

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI | URL | China mirror
arm64-v8a | Download | Download
armeabi-v7a | Download | Download
x86_64 | Download | Download
x86 | Download | Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to try vits-piper-pl_PL-gosia-medium with the C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-pl_PL-gosia-medium/pl_PL-gosia-medium.onnx";
  config.model.vits.tokens = "vits-piper-pl_PL-gosia-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-pl_PL-gosia-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Nieważne, za kogo walczysz, i tak popełnisz błąd";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc   -I /tmp/sherpa-onnx/shared/include   -L /tmp/sherpa-onnx/shared/lib   -o /tmp/test-piper   /tmp/test-piper.c   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-pl_PL-gosia-medium.tar.bz2

You can use the following code to play with vits-piper-pl_PL-gosia-medium

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-pl_PL-gosia-medium/pl_PL-gosia-medium.onnx",
            data_dir="vits-piper-pl_PL-gosia-medium/espeak-ng-data",
            tokens="vits-piper-pl_PL-gosia-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Nieważne, za kogo walczysz, i tak popełnisz błąd",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
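The C++ example in this document passes a progress callback that returns 1 to continue and 0 to stop. A callback with the same convention can be sketched in Python as below. That the Python binding accepts a callback with this `(samples, progress)` signature is an assumption that the example above does not confirm, and `make_progress_callback` is a hypothetical helper.

```python
def make_progress_callback(sample_rate, max_seconds=None):
    """Create a callback that logs progress and optionally stops early.

    The (samples, progress) -> int signature and the 1 = continue / 0 = stop
    convention mirror the C++ ProgressCallback in this document; using the
    same signature from Python is an assumption.
    """
    state = {"num_samples": 0}

    def callback(samples, progress):
        # Accumulate how much audio has been produced so far.
        state["num_samples"] += len(samples)
        seconds = state["num_samples"] / sample_rate
        print(f"Progress: {progress * 100:.1f}% ({seconds:.2f} s generated)")
        if max_seconds is not None and seconds >= max_seconds:
            return 0  # ask the generator to stop
        return 1  # keep generating

    return callback, state
```

If the installed version of `sherpa_onnx` supports a `callback` keyword on `generate`, the callback returned by `make_progress_callback(22050, max_seconds=10)` could be passed there; otherwise the helper is still usable for testing the stop logic on its own.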

C++ API

You can use the following code to try vits-piper-pl_PL-gosia-medium with the C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-pl_PL-gosia-medium/pl_PL-gosia-medium.onnx";
  config.model.vits.tokens = "vits-piper-pl_PL-gosia-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-pl_PL-gosia-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Nieważne, za kogo walczysz, i tak popełnisz błąd";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++   -std=c++17   -I /tmp/sherpa-onnx/shared/include   -L /tmp/sherpa-onnx/shared/lib   -o /tmp/test-piper   /tmp/test-piper.cc   -lsherpa-onnx-cxx-api   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to try vits-piper-pl_PL-gosia-medium with the Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-pl_PL-gosia-medium/pl_PL-gosia-medium.onnx".into()),
                tokens: Some("vits-piper-pl_PL-gosia-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-pl_PL-gosia-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Nieważne, za kogo walczysz, i tak popełnisz błąd";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-pl_PL-gosia-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-pl_PL-gosia-medium/pl_PL-gosia-medium.onnx',
        tokens: 'vits-piper-pl_PL-gosia-medium/tokens.txt',
        dataDir: 'vits-piper-pl_PL-gosia-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Nieważne, za kogo walczysz, i tak popełnisz błąd';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to try vits-piper-pl_PL-gosia-medium with the Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-pl_PL-gosia-medium/pl_PL-gosia-medium.onnx',
    tokens: 'vits-piper-pl_PL-gosia-medium/tokens.txt',
    dataDir: 'vits-piper-pl_PL-gosia-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Nieważne, za kogo walczysz, i tak popełnisz błąd', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to try vits-piper-pl_PL-gosia-medium with the Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-pl_PL-gosia-medium/pl_PL-gosia-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-pl_PL-gosia-medium/tokens.txt",
    dataDir: "vits-piper-pl_PL-gosia-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Nieważne, za kogo walczysz, i tak popełnisz błąd"
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to try vits-piper-pl_PL-gosia-medium with the C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-pl_PL-gosia-medium/pl_PL-gosia-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-pl_PL-gosia-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-pl_PL-gosia-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Nieważne, za kogo walczysz, i tak popełnisz błąd";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to try vits-piper-pl_PL-gosia-medium with the Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-pl_PL-gosia-medium/pl_PL-gosia-medium.onnx",
        tokens = "vits-piper-pl_PL-gosia-medium/tokens.txt",
        dataDir = "vits-piper-pl_PL-gosia-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Nieważne, za kogo walczysz, i tak popełnisz błąd",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.
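
The callback contract used above (return 1 to continue, 0 to stop) can be illustrated without the library. The Python sketch below is a toy simulation and not sherpa-onnx code: `feed_chunks` and `make_limit_callback` are hypothetical names, and the hard-coded chunks stand in for audio that a real engine would pass to the callback.

```python
from typing import Callable, Iterable, List


def feed_chunks(chunks: Iterable[List[float]],
                callback: Callable[[List[float]], int]) -> List[float]:
    """Toy driver: hand each chunk to the callback, stop when it returns 0."""
    collected: List[float] = []
    for chunk in chunks:
        collected.extend(chunk)
        if callback(chunk) == 0:  # 0 means stop, 1 means continue
            break
    return collected


def make_limit_callback(max_samples: int) -> Callable[[List[float]], int]:
    """Build a callback that asks to stop once max_samples have been seen."""
    seen = 0

    def callback(samples: List[float]) -> int:
        nonlocal seen
        seen += len(samples)
        return 1 if seen < max_samples else 0

    return callback


# Three chunks of 100 samples each; the callback stops after the second one.
chunks = [[0.0] * 100, [0.1] * 100, [0.2] * 100]
print(len(feed_chunks(chunks, make_limit_callback(150))))  # 200
```

A callback that always returns 1, like the one in the Kotlin example, lets generation run to the end.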

Java API

You can use the following code to play with vits-piper-pl_PL-gosia-medium with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-pl_PL-gosia-medium/pl_PL-gosia-medium.onnx");
    vits.setTokens("vits-piper-pl_PL-gosia-medium/tokens.txt");
    vits.setDataDir("vits-piper-pl_PL-gosia-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Nieważne, za kogo walczysz, i tak popełnisz błąd";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-pl_PL-gosia-medium with Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-pl_PL-gosia-medium/pl_PL-gosia-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-pl_PL-gosia-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-pl_PL-gosia-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Nieważne, za kogo walczysz, i tak popełnisz błąd', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-pl_PL-gosia-medium with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-pl_PL-gosia-medium/pl_PL-gosia-medium.onnx",
				Tokens:  "vits-piper-pl_PL-gosia-medium/tokens.txt",
				DataDir: "vits-piper-pl_PL-gosia-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Nieważne, za kogo walczysz, i tak popełnisz błąd"

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Nieważne, za kogo walczysz, i tak popełnisz błąd

sample audio for each speaker is listed below:

Speaker 0

vits-piper-pl_PL-jarvis_wg_glos-medium

Info about this model

This model is converted from https://github.com/k2-fsa/sherpa-onnx/issues/2402

Number of speakers    Sample rate
1                     22050

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-pl_PL-jarvis_wg_glos-medium.tar.bz2
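
If you prefer to script the download, the following Python sketch fetches and unpacks the archive using only the standard library. The helper names `download` and `extract` are illustrative and not part of sherpa-onnx. The sketch assumes you trust the archive, since `extractall` writes whatever paths the archive contains.

```python
import tarfile
import urllib.request
from pathlib import Path

MODEL_URL = (
    "https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/"
    "vits-piper-pl_PL-jarvis_wg_glos-medium.tar.bz2"
)


def download(url: str, archive: Path) -> Path:
    """Fetch url into archive unless the file already exists."""
    if not archive.exists():
        urllib.request.urlretrieve(url, archive)
    return archive


def extract(archive: Path, dest: Path) -> list:
    """Unpack a .tar.bz2 archive into dest and return the member names."""
    with tarfile.open(archive, "r:bz2") as tar:
        names = tar.getnames()
        tar.extractall(dest)
    return names


# Usage (needs network access):
# archive = download(MODEL_URL, Path("vits-piper-pl_PL-jarvis_wg_glos-medium.tar.bz2"))
# extract(archive, Path("."))
```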

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI            URL         China mirror
arm64-v8a      Download    Download
armeabi-v7a    Download    Download
x86_64         Download    Download
x86            Download    Download

If you don’t know what ABI is, you probably need to select arm64-v8a.
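
If you can run Python on the target device (for example, in a terminal emulator), the machine name reported by the OS gives a hint about the ABI. The sketch below is a rough guide only. The mapping table is an assumption covering common machine names, it is not exhaustive, and `guess_abi` is a hypothetical helper.

```python
import platform

# Common machine names mapped to Android ABI names.
# This table is an illustrative assumption, not an exhaustive list.
MACHINE_TO_ABI = {
    "aarch64": "arm64-v8a",
    "arm64": "arm64-v8a",
    "armv7l": "armeabi-v7a",
    "armv8l": "armeabi-v7a",
    "x86_64": "x86_64",
    "amd64": "x86_64",
    "i686": "x86",
    "i386": "x86",
}


def guess_abi(machine: str, default: str = "arm64-v8a") -> str:
    """Return the ABI name for a machine string, or default if unknown."""
    return MACHINE_TO_ABI.get(machine.lower(), default)


print(guess_abi(platform.machine()))
```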

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-pl_PL-jarvis_wg_glos-medium with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-pl_PL-jarvis_wg_glos-medium/pl_PL-jarvis_wg_glos-medium.onnx";
  config.model.vits.tokens = "vits-piper-pl_PL-jarvis_wg_glos-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-pl_PL-jarvis_wg_glos-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Nieważne, za kogo walczysz, i tak popełnisz błąd";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc   -I /tmp/sherpa-onnx/shared/include   -L /tmp/sherpa-onnx/shared/lib   -o /tmp/test-piper   /tmp/test-piper.c   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.
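
To confirm that the generated file looks right, you can inspect its header. The following Python sketch uses the standard `wave` module and assumes the output is an uncompressed PCM WAV file, which is what that module can read. `describe_wav` is an illustrative helper name. For this model you would expect a sample rate of 22050.

```python
import wave


def describe_wav(path: str) -> dict:
    """Read the WAV header and return basic properties of the file."""
    with wave.open(path, "rb") as f:
        frames = f.getnframes()
        rate = f.getframerate()
        return {
            "sample_rate": rate,
            "channels": f.getnchannels(),
            "duration_seconds": frames / rate,
        }


# Usage:
# print(describe_wav("/tmp/test.wav"))
```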

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-pl_PL-jarvis_wg_glos-medium.tar.bz2

You can use the following code to play with vits-piper-pl_PL-jarvis_wg_glos-medium

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-pl_PL-jarvis_wg_glos-medium/pl_PL-jarvis_wg_glos-medium.onnx",
            data_dir="vits-piper-pl_PL-jarvis_wg_glos-medium/espeak-ng-data",
            tokens="vits-piper-pl_PL-jarvis_wg_glos-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Nieważne, za kogo walczysz, i tak popełnisz błąd",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)

C++ API

You can use the following code to play with vits-piper-pl_PL-jarvis_wg_glos-medium with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-pl_PL-jarvis_wg_glos-medium/pl_PL-jarvis_wg_glos-medium.onnx";
  config.model.vits.tokens = "vits-piper-pl_PL-jarvis_wg_glos-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-pl_PL-jarvis_wg_glos-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Nieważne, za kogo walczysz, i tak popełnisz błąd";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++   -std=c++17   -I /tmp/sherpa-onnx/shared/include   -L /tmp/sherpa-onnx/shared/lib   -o /tmp/test-piper   /tmp/test-piper.cc   -lsherpa-onnx-cxx-api   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with vits-piper-pl_PL-jarvis_wg_glos-medium with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-pl_PL-jarvis_wg_glos-medium/pl_PL-jarvis_wg_glos-medium.onnx".into()),
                tokens: Some("vits-piper-pl_PL-jarvis_wg_glos-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-pl_PL-jarvis_wg_glos-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Nieważne, za kogo walczysz, i tak popełnisz błąd";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-pl_PL-jarvis_wg_glos-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-pl_PL-jarvis_wg_glos-medium/pl_PL-jarvis_wg_glos-medium.onnx',
        tokens: 'vits-piper-pl_PL-jarvis_wg_glos-medium/tokens.txt',
        dataDir: 'vits-piper-pl_PL-jarvis_wg_glos-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Nieważne, za kogo walczysz, i tak popełnisz błąd';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.
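
The real-time factor (RTF) printed by the example above is the synthesis time divided by the duration of the generated audio. The same arithmetic is shown below as a small standalone Python function. The function name is illustrative and the numbers in the usage line are made up.

```python
def real_time_factor(elapsed_seconds: float, num_samples: int,
                     sample_rate: int) -> float:
    """RTF = processing time / audio duration; below 1.0 is faster than real time."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate
    return elapsed_seconds / duration


# Made-up numbers: 0.5 s to synthesize 2 s of audio at 22050 Hz.
print(real_time_factor(0.5, 44100, 22050))  # 0.25
```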

Dart API

You can use the following code to play with vits-piper-pl_PL-jarvis_wg_glos-medium with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-pl_PL-jarvis_wg_glos-medium/pl_PL-jarvis_wg_glos-medium.onnx',
    tokens: 'vits-piper-pl_PL-jarvis_wg_glos-medium/tokens.txt',
    dataDir: 'vits-piper-pl_PL-jarvis_wg_glos-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Nieważne, za kogo walczysz, i tak popełnisz błąd', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-pl_PL-jarvis_wg_glos-medium with Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-pl_PL-jarvis_wg_glos-medium/pl_PL-jarvis_wg_glos-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-pl_PL-jarvis_wg_glos-medium/tokens.txt",
    dataDir: "vits-piper-pl_PL-jarvis_wg_glos-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Nieważne, za kogo walczysz, i tak popełnisz błąd"
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-pl_PL-jarvis_wg_glos-medium with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-pl_PL-jarvis_wg_glos-medium/pl_PL-jarvis_wg_glos-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-pl_PL-jarvis_wg_glos-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-pl_PL-jarvis_wg_glos-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Nieważne, za kogo walczysz, i tak popełnisz błąd";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-pl_PL-jarvis_wg_glos-medium with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-pl_PL-jarvis_wg_glos-medium/pl_PL-jarvis_wg_glos-medium.onnx",
        tokens = "vits-piper-pl_PL-jarvis_wg_glos-medium/tokens.txt",
        dataDir = "vits-piper-pl_PL-jarvis_wg_glos-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Nieważne, za kogo walczysz, i tak popełnisz błąd",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with vits-piper-pl_PL-jarvis_wg_glos-medium with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-pl_PL-jarvis_wg_glos-medium/pl_PL-jarvis_wg_glos-medium.onnx");
    vits.setTokens("vits-piper-pl_PL-jarvis_wg_glos-medium/tokens.txt");
    vits.setDataDir("vits-piper-pl_PL-jarvis_wg_glos-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Nieważne, za kogo walczysz, i tak popełnisz błąd";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-pl_PL-jarvis_wg_glos-medium with Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-pl_PL-jarvis_wg_glos-medium/pl_PL-jarvis_wg_glos-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-pl_PL-jarvis_wg_glos-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-pl_PL-jarvis_wg_glos-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Nieważne, za kogo walczysz, i tak popełnisz błąd', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-pl_PL-jarvis_wg_glos-medium with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-pl_PL-jarvis_wg_glos-medium/pl_PL-jarvis_wg_glos-medium.onnx",
				Tokens:  "vits-piper-pl_PL-jarvis_wg_glos-medium/tokens.txt",
				DataDir: "vits-piper-pl_PL-jarvis_wg_glos-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Nieważne, za kogo walczysz, i tak popełnisz błąd"

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Nieważne, za kogo walczysz, i tak popełnisz błąd

sample audio for each speaker is listed below:

Speaker 0

vits-piper-pl_PL-justyna_wg_glos-medium

Info about this model

This model is converted from https://github.com/k2-fsa/sherpa-onnx/issues/2402

Number of speakers    Sample rate
1                     22050

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-pl_PL-justyna_wg_glos-medium.tar.bz2
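
After extracting the archive, it can be useful to check that the paths used by the examples on this page exist. The Python sketch below checks for the three entries those examples reference. The archive may contain other files as well, and `missing_entries` is an illustrative helper name.

```python
from pathlib import Path

MODEL_DIR = "vits-piper-pl_PL-justyna_wg_glos-medium"

# Paths referenced by the API examples for this model.
EXPECTED = [
    "pl_PL-justyna_wg_glos-medium.onnx",
    "tokens.txt",
    "espeak-ng-data",
]


def missing_entries(model_dir: str, expected=EXPECTED) -> list:
    """Return the expected entries that are absent from model_dir."""
    root = Path(model_dir)
    return [name for name in expected if not (root / name).exists()]


# An empty list means every expected path was found.
print(missing_entries(MODEL_DIR))
```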

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI            URL         China mirror
arm64-v8a      Download    Download
armeabi-v7a    Download    Download
x86_64         Download    Download
x86            Download    Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-pl_PL-justyna_wg_glos-medium with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-pl_PL-justyna_wg_glos-medium/pl_PL-justyna_wg_glos-medium.onnx";
  config.model.vits.tokens = "vits-piper-pl_PL-justyna_wg_glos-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-pl_PL-justyna_wg_glos-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Nieważne, za kogo walczysz, i tak popełnisz błąd";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc   -I /tmp/sherpa-onnx/shared/include   -L /tmp/sherpa-onnx/shared/lib   -o /tmp/test-piper   /tmp/test-piper.c   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-pl_PL-justyna_wg_glos-medium.tar.bz2

You can use the following code to play with vits-piper-pl_PL-justyna_wg_glos-medium

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-pl_PL-justyna_wg_glos-medium/pl_PL-justyna_wg_glos-medium.onnx",
            data_dir="vits-piper-pl_PL-justyna_wg_glos-medium/espeak-ng-data",
            tokens="vits-piper-pl_PL-justyna_wg_glos-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Nieważne, za kogo walczysz, i tak popełnisz błąd",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
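
If you would rather not depend on soundfile, the samples can be written as a 16-bit PCM WAV file with the standard library alone. The sketch below assumes the samples are mono floats in the range [-1, 1] and clips anything outside it. `write_wav_int16` is an illustrative helper, not part of sherpa-onnx.

```python
import struct
import wave


def write_wav_int16(path: str, samples, sample_rate: int) -> int:
    """Write mono float samples as 16-bit PCM and return the frame count."""
    # Clip to the assumed [-1, 1] range, then scale to signed 16-bit integers.
    ints = [int(max(-1.0, min(1.0, float(s))) * 32767) for s in samples]
    with wave.open(path, "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        f.setframerate(sample_rate)
        f.writeframes(struct.pack("<%dh" % len(ints), *ints))
    return len(ints)


# Usage with the objects from the example above:
# write_wav_int16("test.wav", audio.samples, audio.sample_rate)
```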

C++ API

You can use the following code to play with vits-piper-pl_PL-justyna_wg_glos-medium with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-pl_PL-justyna_wg_glos-medium/pl_PL-justyna_wg_glos-medium.onnx";
  config.model.vits.tokens = "vits-piper-pl_PL-justyna_wg_glos-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-pl_PL-justyna_wg_glos-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Nieważne, za kogo walczysz, i tak popełnisz błąd";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++   -std=c++17   -I /tmp/sherpa-onnx/shared/include   -L /tmp/sherpa-onnx/shared/lib   -o /tmp/test-piper   /tmp/test-piper.cc   -lsherpa-onnx-cxx-api   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with vits-piper-pl_PL-justyna_wg_glos-medium with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-pl_PL-justyna_wg_glos-medium/pl_PL-justyna_wg_glos-medium.onnx".into()),
                tokens: Some("vits-piper-pl_PL-justyna_wg_glos-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-pl_PL-justyna_wg_glos-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Nieważne, za kogo walczysz, i tak popełnisz błąd";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-pl_PL-justyna_wg_glos-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-pl_PL-justyna_wg_glos-medium/pl_PL-justyna_wg_glos-medium.onnx',
        tokens: 'vits-piper-pl_PL-justyna_wg_glos-medium/tokens.txt',
        dataDir: 'vits-piper-pl_PL-justyna_wg_glos-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Nieważne, za kogo walczysz, i tak popełnisz błąd';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.
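
The real-time factor (RTF) printed by the example above is the processing time divided by the duration of the generated audio, so a value below 1 means synthesis ran faster than playback. A small Python sketch of that arithmetic follows; the function names are hypothetical and the list of zeros only stands in for a real synthesis call.

```python
import time


def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """RTF = processing time / audio duration."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("need a non-empty waveform and a positive sample rate")
    duration = num_samples / sample_rate
    return elapsed_seconds / duration


def timed(fn, *args, **kwargs):
    """Run fn and return (result, elapsed_seconds) using a monotonic clock."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


# Stand-in for a synthesis call: two seconds of silence at 22050 Hz.
samples, elapsed = timed(lambda: [0.0] * 44100)
print(f"RTF = {real_time_factor(elapsed, len(samples), 22050):.3f}")
```

For example, 0.5 seconds of processing for 2 seconds of audio gives an RTF of 0.25.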

Dart API

You can use the following code to try vits-piper-pl_PL-justyna_wg_glos-medium with the Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-pl_PL-justyna_wg_glos-medium/pl_PL-justyna_wg_glos-medium.onnx',
    tokens: 'vits-piper-pl_PL-justyna_wg_glos-medium/tokens.txt',
    dataDir: 'vits-piper-pl_PL-justyna_wg_glos-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Nieważne, za kogo walczysz, i tak popełnisz błąd', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to try vits-piper-pl_PL-justyna_wg_glos-medium with the Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-pl_PL-justyna_wg_glos-medium/pl_PL-justyna_wg_glos-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-pl_PL-justyna_wg_glos-medium/tokens.txt",
    dataDir: "vits-piper-pl_PL-justyna_wg_glos-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Nieważne, za kogo walczysz, i tak popełnisz błąd"
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to try vits-piper-pl_PL-justyna_wg_glos-medium with the C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-pl_PL-justyna_wg_glos-medium/pl_PL-justyna_wg_glos-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-pl_PL-justyna_wg_glos-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-pl_PL-justyna_wg_glos-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Nieważne, za kogo walczysz, i tak popełnisz błąd";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to try vits-piper-pl_PL-justyna_wg_glos-medium with the Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-pl_PL-justyna_wg_glos-medium/pl_PL-justyna_wg_glos-medium.onnx",
        tokens = "vits-piper-pl_PL-justyna_wg_glos-medium/tokens.txt",
        dataDir = "vits-piper-pl_PL-justyna_wg_glos-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Nieważne, za kogo walczysz, i tak popełnisz błąd",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to try vits-piper-pl_PL-justyna_wg_glos-medium with the Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-pl_PL-justyna_wg_glos-medium/pl_PL-justyna_wg_glos-medium.onnx");
    vits.setTokens("vits-piper-pl_PL-justyna_wg_glos-medium/tokens.txt");
    vits.setDataDir("vits-piper-pl_PL-justyna_wg_glos-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Nieważne, za kogo walczysz, i tak popełnisz błąd";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.
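
The callbacks in the Kotlin and Java examples return 1 to continue and 0 to stop. The Python sketch below only simulates that contract with a hypothetical chunk-by-chunk generator; it is not the sherpa-onnx implementation, and whether the real engine keeps the chunk that triggered the stop is an assumption made here for illustration.

```python
def generate_in_chunks(chunks, callback=None):
    """Hypothetical stand-in for an engine that emits audio chunk by chunk.

    The callback runs after each chunk; returning 0 stops generation,
    any other value lets it continue.
    """
    produced = []
    for chunk in chunks:
        produced.extend(chunk)
        if callback is not None and callback(chunk) == 0:
            break
    return produced


def stop_after(max_samples):
    """Build a callback that asks to stop once max_samples have been seen."""
    seen = 0

    def callback(chunk):
        nonlocal seen
        seen += len(chunk)
        return 0 if seen >= max_samples else 1

    return callback


chunks = [[0.0] * 100, [0.0] * 100, [0.0] * 100]
print(len(generate_in_chunks(chunks, lambda chunk: 1)))  # callback never stops
print(len(generate_in_chunks(chunks, stop_after(150))))  # stops during chunk 2
```

A callback of this kind is a natural place to feed samples to an audio player as they arrive, or to cancel a long synthesis when the user navigates away.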

Pascal API

You can use the following code to try vits-piper-pl_PL-justyna_wg_glos-medium with the Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-pl_PL-justyna_wg_glos-medium/pl_PL-justyna_wg_glos-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-pl_PL-justyna_wg_glos-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-pl_PL-justyna_wg_glos-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Nieważne, za kogo walczysz, i tak popełnisz błąd', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to try vits-piper-pl_PL-justyna_wg_glos-medium with the Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-pl_PL-justyna_wg_glos-medium/pl_PL-justyna_wg_glos-medium.onnx",
				Tokens:  "vits-piper-pl_PL-justyna_wg_glos-medium/tokens.txt",
				DataDir: "vits-piper-pl_PL-justyna_wg_glos-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Nieważne, za kogo walczysz, i tak popełnisz błąd"

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.
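
Every example for this model points the configuration at the same three paths: the .onnx file, tokens.txt, and the espeak-ng-data directory. A quick pre-flight check, sketched below in Python with a hypothetical helper name, can report which of them are missing before any API is called. It assumes the archive was extracted into the current directory.

```python
from pathlib import Path


def missing_model_files(model_dir, model_file):
    """Return the expected paths under model_dir that do not exist."""
    root = Path(model_dir)
    expected = [root / model_file, root / "tokens.txt", root / "espeak-ng-data"]
    return [str(p) for p in expected if not p.exists()]


missing = missing_model_files(
    "vits-piper-pl_PL-justyna_wg_glos-medium",
    "pl_PL-justyna_wg_glos-medium.onnx",
)
if missing:
    print("Missing:", ", ".join(missing))
else:
    print("All model files found")
```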

Samples

For the following text:

Nieważne, za kogo walczysz, i tak popełnisz błąd

audio samples for each speaker are listed below:

Speaker 0

vits-piper-pl_PL-mc_speech-medium

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/pl/pl_PL/mc_speech/medium

| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-pl_PL-mc_speech-medium.tar.bz2
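
If wget and tar are not available, the archive can also be fetched and unpacked with the Python standard library. This is a minimal sketch under the assumption that every model archive follows the URL pattern shown above; the function names are hypothetical, and the download itself is not executed here.

```python
import tarfile
import urllib.request

BASE_URL = "https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models"


def archive_url(model_name):
    """Build the release URL for a model archive (pattern assumed from above)."""
    return f"{BASE_URL}/{model_name}.tar.bz2"


def download_and_extract(model_name, dest="."):
    """Download <model_name>.tar.bz2 into the current directory and unpack it."""
    archive = f"{model_name}.tar.bz2"
    urllib.request.urlretrieve(archive_url(model_name), archive)
    with tarfile.open(archive, "r:bz2") as tar:
        tar.extractall(dest)


print(archive_url("vits-piper-pl_PL-mc_speech-medium"))
```

Calling `download_and_extract("vits-piper-pl_PL-mc_speech-medium")` would then leave a directory of the same name next to the archive.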

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

| ABI | URL | China mirror |
|---|---|---|
| arm64-v8a | Download | Download |
| armeabi-v7a | Download | Download |
| x86_64 | Download | Download |
| x86 | Download | Download |

If you don’t know what ABI is, you probably need to select arm64-v8a.
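
As a rough guide to the ABI names in the table, the Python sketch below maps common machine architecture names to Android ABIs. The mapping table and function name are illustrative assumptions, not part of sherpa-onnx; on a connected device, `adb shell getprop ro.product.cpu.abi` reports the ABI directly.

```python
import platform

# Common `uname -m` style machine names mapped to Android ABI names.
MACHINE_TO_ABI = {
    "aarch64": "arm64-v8a",
    "arm64": "arm64-v8a",
    "armv7l": "armeabi-v7a",
    "x86_64": "x86_64",
    "amd64": "x86_64",
    "i686": "x86",
    "i386": "x86",
}


def abi_for_machine(machine, default="arm64-v8a"):
    """Return the Android ABI for a machine name, falling back to arm64-v8a."""
    return MACHINE_TO_ABI.get(machine.lower(), default)


# Prints the ABI matching the machine this script runs on.
print(abi_for_machine(platform.machine()))
```

The fallback mirrors the advice above: when in doubt, arm64-v8a is the most likely choice for a phone.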

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to try vits-piper-pl_PL-mc_speech-medium with the C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-pl_PL-mc_speech-medium/pl_PL-mc_speech-medium.onnx";
  config.model.vits.tokens = "vits-piper-pl_PL-mc_speech-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-pl_PL-mc_speech-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Nieważne, za kogo walczysz, i tak popełnisz błąd";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header and library files in /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc   -I /tmp/sherpa-onnx/shared/include   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-c-api   -lonnxruntime   -o /tmp/test-piper   /tmp/test-piper.c

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

If the program cannot find the shared libraries at startup, set the library search path before you run /tmp/test-piper:

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-pl_PL-mc_speech-medium.tar.bz2

You can use the following code to play with vits-piper-pl_PL-mc_speech-medium

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-pl_PL-mc_speech-medium/pl_PL-mc_speech-medium.onnx",
            data_dir="vits-piper-pl_PL-mc_speech-medium/espeak-ng-data",
            tokens="vits-piper-pl_PL-mc_speech-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Nieważne, za kogo walczysz, i tak popełnisz błąd",
                     sid=0,
                     speed=1.0)

sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
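
If the soundfile package is not available, the generated samples can be written as a WAV file with the standard library alone. The sketch below is an alternative, not what the example above does: the helper name is hypothetical, and it assumes the samples are mono floats in the range [-1, 1].

```python
import os
import struct
import tempfile
import wave


def write_wav_pcm16(filename, samples, sample_rate):
    """Write mono float samples (assumed to lie in [-1, 1]) as 16-bit PCM."""
    frames = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))  # clip to avoid integer overflow
        frames += struct.pack("<h", int(round(s * 32767)))
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)  # 2 bytes = 16 bits per sample
        f.setframerate(sample_rate)
        f.writeframes(bytes(frames))


# Demo with four made-up samples written to a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "test.wav")
write_wav_pcm16(path, [0.0, 0.5, -0.5, 1.0], 22050)
with wave.open(path, "rb") as f:
    print(f.getnframes(), f.getframerate())
```

With the objects from the example above, the call would be `write_wav_pcm16("test.wav", audio.samples, audio.sample_rate)`.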

C++ API

You can use the following code to try vits-piper-pl_PL-mc_speech-medium with the C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-pl_PL-mc_speech-medium/pl_PL-mc_speech-medium.onnx";
  config.model.vits.tokens = "vits-piper-pl_PL-mc_speech-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-pl_PL-mc_speech-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Nieważne, za kogo walczysz, i tak popełnisz błąd";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header and library files in /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++   -std=c++17   -I /tmp/sherpa-onnx/shared/include   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-cxx-api   -lsherpa-onnx-c-api   -lonnxruntime   -o /tmp/test-piper   /tmp/test-piper.cc

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

If the program cannot find the shared libraries at startup, set the library search path before you run /tmp/test-piper:

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to try vits-piper-pl_PL-mc_speech-medium with the Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-pl_PL-mc_speech-medium/pl_PL-mc_speech-medium.onnx".into()),
                tokens: Some("vits-piper-pl_PL-mc_speech-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-pl_PL-mc_speech-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Nieważne, za kogo walczysz, i tak popełnisz błąd";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-pl_PL-mc_speech-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-pl_PL-mc_speech-medium/pl_PL-mc_speech-medium.onnx',
        tokens: 'vits-piper-pl_PL-mc_speech-medium/tokens.txt',
        dataDir: 'vits-piper-pl_PL-mc_speech-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Nieważne, za kogo walczysz, i tak popełnisz błąd';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to try vits-piper-pl_PL-mc_speech-medium with the Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-pl_PL-mc_speech-medium/pl_PL-mc_speech-medium.onnx',
    tokens: 'vits-piper-pl_PL-mc_speech-medium/tokens.txt',
    dataDir: 'vits-piper-pl_PL-mc_speech-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Nieważne, za kogo walczysz, i tak popełnisz błąd', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to try vits-piper-pl_PL-mc_speech-medium with the Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-pl_PL-mc_speech-medium/pl_PL-mc_speech-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-pl_PL-mc_speech-medium/tokens.txt",
    dataDir: "vits-piper-pl_PL-mc_speech-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Nieważne, za kogo walczysz, i tak popełnisz błąd"
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to try vits-piper-pl_PL-mc_speech-medium with the C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-pl_PL-mc_speech-medium/pl_PL-mc_speech-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-pl_PL-mc_speech-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-pl_PL-mc_speech-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Nieważne, za kogo walczysz, i tak popełnisz błąd";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to try vits-piper-pl_PL-mc_speech-medium with the Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-pl_PL-mc_speech-medium/pl_PL-mc_speech-medium.onnx",
        tokens = "vits-piper-pl_PL-mc_speech-medium/tokens.txt",
        dataDir = "vits-piper-pl_PL-mc_speech-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Nieważne, za kogo walczysz, i tak popełnisz błąd",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to try vits-piper-pl_PL-mc_speech-medium with the Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-pl_PL-mc_speech-medium/pl_PL-mc_speech-medium.onnx");
    vits.setTokens("vits-piper-pl_PL-mc_speech-medium/tokens.txt");
    vits.setDataDir("vits-piper-pl_PL-mc_speech-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Nieważne, za kogo walczysz, i tak popełnisz błąd";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to try vits-piper-pl_PL-mc_speech-medium with the Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-pl_PL-mc_speech-medium/pl_PL-mc_speech-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-pl_PL-mc_speech-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-pl_PL-mc_speech-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Nieważne, za kogo walczysz, i tak popełnisz błąd', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to try vits-piper-pl_PL-mc_speech-medium with the Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-pl_PL-mc_speech-medium/pl_PL-mc_speech-medium.onnx",
				Tokens:  "vits-piper-pl_PL-mc_speech-medium/tokens.txt",
				DataDir: "vits-piper-pl_PL-mc_speech-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Nieważne, za kogo walczysz, i tak popełnisz błąd"

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Nieważne, za kogo walczysz, i tak popełnisz błąd

audio samples for each speaker are listed below:

Speaker 0

vits-piper-pl_PL-meski_wg_glos-medium

Info about this model

This model is converted from https://github.com/k2-fsa/sherpa-onnx/issues/2402

| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
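
To give a feel for what a 22050 Hz sample rate means in practice, the Python sketch below converts a duration into a sample count and a raw data size. It assumes mono 16-bit PCM storage, which is an assumption for illustration and not stated in the model description; the function names are hypothetical.

```python
def num_samples(duration_seconds, sample_rate=22050):
    """Number of samples needed for the given duration."""
    return int(round(duration_seconds * sample_rate))


def pcm16_mono_bytes(duration_seconds, sample_rate=22050):
    """Raw audio payload size, assuming 2 bytes per sample and one channel."""
    return num_samples(duration_seconds, sample_rate) * 2


print(num_samples(1.0), "samples per second")
print(pcm16_mono_bytes(3.0), "bytes for a 3-second utterance")
```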

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-pl_PL-meski_wg_glos-medium.tar.bz2

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

| ABI | URL | China mirror |
|---|---|---|
| arm64-v8a | Download | Download |
| armeabi-v7a | Download | Download |
| x86_64 | Download | Download |
| x86 | Download | Download |

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to try vits-piper-pl_PL-meski_wg_glos-medium with the C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-pl_PL-meski_wg_glos-medium/pl_PL-meski_wg_glos-medium.onnx";
  config.model.vits.tokens = "vits-piper-pl_PL-meski_wg_glos-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-pl_PL-meski_wg_glos-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Nieważne, za kogo walczysz, i tak popełnisz błąd";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and libraries inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-pl_PL-meski_wg_glos-medium.tar.bz2

You can use the following code to play with vits-piper-pl_PL-meski_wg_glos-medium

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-pl_PL-meski_wg_glos-medium/pl_PL-meski_wg_glos-medium.onnx",
            data_dir="vits-piper-pl_PL-meski_wg_glos-medium/espeak-ng-data",
            tokens="vits-piper-pl_PL-meski_wg_glos-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Nieważne, za kogo walczysz, i tak popełnisz błąd",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
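
The example above relies on the soundfile package to save the audio. If you prefer not to install it, the following is a minimal sketch that writes the samples with Python's standard library only. The helper name write_wav_16bit is hypothetical, and the sketch assumes the samples are mono floats in the range [-1, 1].

```python
import array
import wave


def write_wav_16bit(filename, samples, sample_rate):
    """Write mono float samples in [-1, 1] as a 16-bit PCM WAV file."""
    pcm = array.array("h")  # signed 16-bit integers, native byte order
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))  # clip to avoid integer overflow
        pcm.append(int(s * 32767))

    with wave.open(filename, "wb") as f:
        f.setnchannels(1)  # mono
        f.setsampwidth(2)  # 2 bytes = 16 bits per sample
        f.setframerate(sample_rate)
        # The wave module takes native-order bytes and handles endianness itself.
        f.writeframes(pcm.tobytes())
```

With the variables from the example above, it could be called as write_wav_16bit("test.wav", audio.samples, audio.sample_rate).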

C++ API

You can use the following code to play with vits-piper-pl_PL-meski_wg_glos-medium with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-pl_PL-meski_wg_glos-medium/pl_PL-meski_wg_glos-medium.onnx";
  config.model.vits.tokens = "vits-piper-pl_PL-meski_wg_glos-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-pl_PL-meski_wg_glos-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Nieważne, za kogo walczysz, i tak popełnisz błąd";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and libraries inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html
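
The ProgressCallback in the C++ example returns 1 to continue generating and 0 to stop. That return value can be used to cap how much audio is produced. The following Python sketch illustrates the idea with a plain callable; the class name DurationLimiter, the (samples, progress) argument shape, and the 22050 Hz rate used in the demo are assumptions for illustration, so please check them against the API and model you actually use.

```python
class DurationLimiter:
    """Collect generated chunks and request a stop after `max_seconds`."""

    def __init__(self, sample_rate, max_seconds):
        self.max_samples = int(sample_rate * max_seconds)
        self.samples = []

    def __call__(self, samples, progress):
        self.samples.extend(samples)
        # 1 asks the generator to continue, 0 asks it to stop.
        return 1 if len(self.samples) < self.max_samples else 0


limiter = DurationLimiter(sample_rate=22050, max_seconds=1.0)
print(limiter([0.0] * 11025, 0.25))  # half a second collected: prints 1
print(limiter([0.0] * 11025, 0.50))  # one second collected: prints 0
```

Keeping the collected samples on the object means the partial audio is still available after generation has been stopped early.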

Rust API

You can use the following code to play with vits-piper-pl_PL-meski_wg_glos-medium with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-pl_PL-meski_wg_glos-medium/pl_PL-meski_wg_glos-medium.onnx".into()),
                tokens: Some("vits-piper-pl_PL-meski_wg_glos-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-pl_PL-meski_wg_glos-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Nieważne, za kogo walczysz, i tak popełnisz błąd";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-pl_PL-meski_wg_glos-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-pl_PL-meski_wg_glos-medium/pl_PL-meski_wg_glos-medium.onnx',
        tokens: 'vits-piper-pl_PL-meski_wg_glos-medium/tokens.txt',
        dataDir: 'vits-piper-pl_PL-meski_wg_glos-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Nieważne, za kogo walczysz, i tak popełnisz błąd';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to play with vits-piper-pl_PL-meski_wg_glos-medium with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-pl_PL-meski_wg_glos-medium/pl_PL-meski_wg_glos-medium.onnx',
    tokens: 'vits-piper-pl_PL-meski_wg_glos-medium/tokens.txt',
    dataDir: 'vits-piper-pl_PL-meski_wg_glos-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Nieważne, za kogo walczysz, i tak popełnisz błąd', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-pl_PL-meski_wg_glos-medium with Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-pl_PL-meski_wg_glos-medium/pl_PL-meski_wg_glos-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-pl_PL-meski_wg_glos-medium/tokens.txt",
    dataDir: "vits-piper-pl_PL-meski_wg_glos-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Nieważne, za kogo walczysz, i tak popełnisz błąd"
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-pl_PL-meski_wg_glos-medium with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-pl_PL-meski_wg_glos-medium/pl_PL-meski_wg_glos-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-pl_PL-meski_wg_glos-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-pl_PL-meski_wg_glos-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Nieważne, za kogo walczysz, i tak popełnisz błąd";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-pl_PL-meski_wg_glos-medium with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-pl_PL-meski_wg_glos-medium/pl_PL-meski_wg_glos-medium.onnx",
        tokens = "vits-piper-pl_PL-meski_wg_glos-medium/tokens.txt",
        dataDir = "vits-piper-pl_PL-meski_wg_glos-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Nieważne, za kogo walczysz, i tak popełnisz błąd",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with vits-piper-pl_PL-meski_wg_glos-medium with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-pl_PL-meski_wg_glos-medium/pl_PL-meski_wg_glos-medium.onnx");
    vits.setTokens("vits-piper-pl_PL-meski_wg_glos-medium/tokens.txt");
    vits.setDataDir("vits-piper-pl_PL-meski_wg_glos-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Nieważne, za kogo walczysz, i tak popełnisz błąd";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-pl_PL-meski_wg_glos-medium with Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-pl_PL-meski_wg_glos-medium/pl_PL-meski_wg_glos-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-pl_PL-meski_wg_glos-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-pl_PL-meski_wg_glos-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Nieważne, za kogo walczysz, i tak popełnisz błąd', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-pl_PL-meski_wg_glos-medium with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-pl_PL-meski_wg_glos-medium/pl_PL-meski_wg_glos-medium.onnx",
				Tokens:  "vits-piper-pl_PL-meski_wg_glos-medium/tokens.txt",
				DataDir: "vits-piper-pl_PL-meski_wg_glos-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Nieważne, za kogo walczysz, i tak popełnisz błąd"

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Nieważne, za kogo walczysz, i tak popełnisz błąd

sample audios for different speakers are listed below:

Speaker 0

vits-piper-pl_PL-zenski_wg_glos-medium

Info about this model

This model is converted from https://github.com/k2-fsa/sherpa-onnx/issues/2402

Number of speakers   Sample rate
1                    22050

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-pl_PL-zenski_wg_glos-medium.tar.bz2
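
If you would rather fetch the model from a script than by hand, the following is a minimal sketch using only the Python standard library. The helper names are hypothetical, and it assumes the archive extracts into a directory named after the archive, which matches the paths used in the code examples on this page.

```python
import tarfile
import urllib.request
from pathlib import Path

MODEL_URL = (
    "https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/"
    "vits-piper-pl_PL-zenski_wg_glos-medium.tar.bz2"
)


def archive_name(url):
    """Return the file name component of a download URL."""
    return url.rsplit("/", 1)[-1]


def model_dir_name(url):
    """Return the directory the archive is assumed to extract to."""
    name = archive_name(url)
    suffix = ".tar.bz2"
    return name[: -len(suffix)] if name.endswith(suffix) else name


def download_and_extract(url, dest="."):
    """Download the archive into `dest`, extract it, and return the model dir."""
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    archive = dest / archive_name(url)
    urllib.request.urlretrieve(url, str(archive))
    with tarfile.open(str(archive), "r:bz2") as tar:
        tar.extractall(str(dest))
    archive.unlink()  # remove the archive once it has been extracted
    return dest / model_dir_name(url)
```

Calling download_and_extract(MODEL_URL) from your working directory would then leave the model files where the examples on this page expect them.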

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI            URL        China mirror
arm64-v8a      Download   Download
armeabi-v7a    Download   Download
x86_64         Download   Download
x86            Download   Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-pl_PL-zenski_wg_glos-medium with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-pl_PL-zenski_wg_glos-medium/pl_PL-zenski_wg_glos-medium.onnx";
  config.model.vits.tokens = "vits-piper-pl_PL-zenski_wg_glos-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-pl_PL-zenski_wg_glos-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Nieważne, za kogo walczysz, i tak popełnisz błąd";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and libraries inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-pl_PL-zenski_wg_glos-medium.tar.bz2

You can use the following code to play with vits-piper-pl_PL-zenski_wg_glos-medium

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-pl_PL-zenski_wg_glos-medium/pl_PL-zenski_wg_glos-medium.onnx",
            data_dir="vits-piper-pl_PL-zenski_wg_glos-medium/espeak-ng-data",
            tokens="vits-piper-pl_PL-zenski_wg_glos-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Nieważne, za kogo walczysz, i tak popełnisz błąd",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
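
To get a feel for synthesis speed, you can measure the real-time factor (RTF), that is, processing time divided by audio duration, in the same way the Node.js example on this page does. The following is a minimal sketch; the helper names real_time_factor and measure are hypothetical.

```python
import time


def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """RTF = processing time / audio duration; below 1 is faster than real time."""
    duration = num_samples / sample_rate
    return elapsed_seconds / duration


def measure(generate):
    """Call `generate()` and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = generate()
    return result, time.perf_counter() - start


# 0.5 s of compute for 2 s of audio at 22050 Hz
print(real_time_factor(0.5, 44100, 22050))  # prints 0.25
```

With the variables from the example above, usage could look like audio, elapsed = measure(lambda: tts.generate(text="...", sid=0, speed=1.0)), followed by real_time_factor(elapsed, len(audio.samples), audio.sample_rate).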

C++ API

You can use the following code to play with vits-piper-pl_PL-zenski_wg_glos-medium with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-pl_PL-zenski_wg_glos-medium/pl_PL-zenski_wg_glos-medium.onnx";
  config.model.vits.tokens = "vits-piper-pl_PL-zenski_wg_glos-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-pl_PL-zenski_wg_glos-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Nieważne, za kogo walczysz, i tak popełnisz błąd";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and libraries inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with vits-piper-pl_PL-zenski_wg_glos-medium with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-pl_PL-zenski_wg_glos-medium/pl_PL-zenski_wg_glos-medium.onnx".into()),
                tokens: Some("vits-piper-pl_PL-zenski_wg_glos-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-pl_PL-zenski_wg_glos-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Nieważne, za kogo walczysz, i tak popełnisz błąd";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-pl_PL-zenski_wg_glos-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-pl_PL-zenski_wg_glos-medium/pl_PL-zenski_wg_glos-medium.onnx',
        tokens: 'vits-piper-pl_PL-zenski_wg_glos-medium/tokens.txt',
        dataDir: 'vits-piper-pl_PL-zenski_wg_glos-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Nieważne, za kogo walczysz, i tak popełnisz błąd';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to play with vits-piper-pl_PL-zenski_wg_glos-medium with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-pl_PL-zenski_wg_glos-medium/pl_PL-zenski_wg_glos-medium.onnx',
    tokens: 'vits-piper-pl_PL-zenski_wg_glos-medium/tokens.txt',
    dataDir: 'vits-piper-pl_PL-zenski_wg_glos-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Nieważne, za kogo walczysz, i tak popełnisz błąd', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-pl_PL-zenski_wg_glos-medium with Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-pl_PL-zenski_wg_glos-medium/pl_PL-zenski_wg_glos-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-pl_PL-zenski_wg_glos-medium/tokens.txt",
    dataDir: "vits-piper-pl_PL-zenski_wg_glos-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Nieważne, za kogo walczysz, i tak popełnisz błąd"
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-pl_PL-zenski_wg_glos-medium with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-pl_PL-zenski_wg_glos-medium/pl_PL-zenski_wg_glos-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-pl_PL-zenski_wg_glos-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-pl_PL-zenski_wg_glos-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Nieważne, za kogo walczysz, i tak popełnisz błąd";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-pl_PL-zenski_wg_glos-medium with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-pl_PL-zenski_wg_glos-medium/pl_PL-zenski_wg_glos-medium.onnx",
        tokens = "vits-piper-pl_PL-zenski_wg_glos-medium/tokens.txt",
        dataDir = "vits-piper-pl_PL-zenski_wg_glos-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Nieważne, za kogo walczysz, i tak popełnisz błąd",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with vits-piper-pl_PL-zenski_wg_glos-medium with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-pl_PL-zenski_wg_glos-medium/pl_PL-zenski_wg_glos-medium.onnx");
    vits.setTokens("vits-piper-pl_PL-zenski_wg_glos-medium/tokens.txt");
    vits.setDataDir("vits-piper-pl_PL-zenski_wg_glos-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Nieważne, za kogo walczysz, i tak popełnisz błąd";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-pl_PL-zenski_wg_glos-medium with Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-pl_PL-zenski_wg_glos-medium/pl_PL-zenski_wg_glos-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-pl_PL-zenski_wg_glos-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-pl_PL-zenski_wg_glos-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Nieważne, za kogo walczysz, i tak popełnisz błąd', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-pl_PL-zenski_wg_glos-medium with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-pl_PL-zenski_wg_glos-medium/pl_PL-zenski_wg_glos-medium.onnx",
				Tokens:  "vits-piper-pl_PL-zenski_wg_glos-medium/tokens.txt",
				DataDir: "vits-piper-pl_PL-zenski_wg_glos-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Nieważne, za kogo walczysz, i tak popełnisz błąd"

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Nieważne, za kogo walczysz, i tak popełnisz błąd

sample audios for different speakers are listed below:

Speaker 0

supertonic-3-pl

Info about this model

This model is supertonic 3 from https://huggingface.co/Supertone/supertonic-3

It supports 31 languages: en, ko, ja, ar, bg, cs, da, de, el, es, et, fi, fr, hi, hr, hu, id, it, lt, lv, nl, pl, pt, ro, ru, sk, sl, sv, tr, uk, vi.

This page shows samples for Polish (pl).
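
Because the language is selected with a short code, it can help to validate the code before passing it to the model. The following is a minimal sketch, assuming only the list of codes given above; `SUPPORTED_LANGS` and `check_lang` are hypothetical names and are not part of sherpa-onnx, and how sherpa-onnx itself reacts to an unlisted code is not covered here.

```python
# Language codes copied from the list above (31 entries).
SUPPORTED_LANGS = frozenset(
    "en ko ja ar bg cs da de el es et fi fr hi hr hu id it lt lv "
    "nl pl pt ro ru sk sl sv tr uk vi".split()
)


def check_lang(code: str) -> str:
    """Return the normalized code, or raise ValueError if it is not listed.

    Hypothetical helper for illustration; not a sherpa-onnx function.
    """
    normalized = code.strip().lower()
    if normalized not in SUPPORTED_LANGS:
        raise ValueError(f"unsupported language code: {code!r}")
    return normalized


print(check_lang("pl"))
```

The returned value can then be used wherever the examples on this page set the `lang` option.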

| Number of speakers | Sample rate |
|---|---|
| 10 | 24000 |

Speaker IDs

sid: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-supertonic-3-tts-int8-2026-05-11.tar.bz2

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

| ABI | URL | China mirror |
|---|---|---|
| arm64-v8a | Download | Download |
| armeabi-v7a | Download | Download |
| x86_64 | Download | Download |
| x86 | Download | Download |

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-supertonic-3-tts-int8-2026-05-11.tar.bz2

You can use the following code to play with supertonic-3:

import sherpa_onnx
import soundfile as sf

tts_config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        supertonic=sherpa_onnx.OfflineTtsSupertonicModelConfig(
            duration_predictor="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
            text_encoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
            vector_estimator="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
            vocoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
            tts_json="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
            unicode_indexer="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
            voice_style="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
        ),
        debug=False,
        num_threads=2,
        provider="cpu",
    ),
)

tts = sherpa_onnx.OfflineTts(tts_config)

gen_config = sherpa_onnx.GenerationConfig()
gen_config.sid = 0
gen_config.num_steps = 8
gen_config.speed = 1.0
gen_config.extra["lang"] = "pl"

audio = tts.generate("Jest to silnik syntezatora mowy wykorzystujący Kaldi nowej generacji", gen_config)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
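
The Python example sets the language through a dictionary entry, while several of the other examples on this page (C, C++, Rust, C#, Java, Pascal, Go) pass the same option as a JSON string. The sketch below shows one way to produce that string from a dictionary using Python's standard `json` module; `make_extra` is a hypothetical helper name, not a sherpa-onnx function.

```python
import json


def make_extra(lang: str) -> str:
    """Serialize the extra options into the JSON form used by the other examples.

    Hypothetical helper for illustration; not part of sherpa-onnx.
    """
    return json.dumps({"lang": lang})


print(make_extra("pl"))  # prints {"lang": "pl"}
```

Building the string with a JSON serializer avoids hand-written escaping mistakes when more options are added.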

C API

You can use the following code to play with supertonic-3 with C API.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));
  config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
  config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
  config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
  config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
  config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
  config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
  config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";

  config.model.num_threads = 2;

  // If you don't want to see debug messages, please set it to 0
  config.model.debug = 1;

  const char *text = "Jest to silnik syntezatora mowy wykorzystujący Kaldi nowej generacji";

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.num_steps = 8;
  gen_cfg.speed = 1.0;
  gen_cfg.extra = "{\"lang\": \"pl\"}";

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg,
                                             ProgressCallback, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

To build and run the example, first build sherpa-onnx as a shared library with the C API enabled:

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
 -DSHERPA_ONNX_ENABLE_C_API=ON \
 -DCMAKE_BUILD_TYPE=Release \
 -DBUILD_SHARED_LIBS=ON \
 -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
 ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-supertonic.c. Then you can compile it with the following command:

gcc \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-supertonic \
  /tmp/test-supertonic.c \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-supertonic.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

C++ API

You can use the following code to play with supertonic-3 with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;

  config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
  config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
  config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
  config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
  config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
  config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
  config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";

  config.model.num_threads = 2;

  // If you don't want to see debug messages, please set it to 0
  config.model.debug = 1;

  std::string filename = "./test.wav";
  std::string text = "Jest to silnik syntezatora mowy wykorzystujący Kaldi nowej generacji";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.num_steps = 8;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed
  gen_cfg.extra = R"({"lang": "pl"})";

  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

To build and run the example, first build sherpa-onnx as a shared library with the C API enabled:

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
 -DSHERPA_ONNX_ENABLE_C_API=ON \
 -DCMAKE_BUILD_TYPE=Release \
 -DBUILD_SHARED_LIBS=ON \
 -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
 ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-supertonic.cc. Then you can compile it with the following command:

g++ \
  -std=c++17 \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-supertonic \
  /tmp/test-supertonic.cc \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-cxx-api \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-supertonic.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with supertonic-3 with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsSupertonicModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            supertonic: OfflineTtsSupertonicModelConfig {
                duration_predictor: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx".into()),
                text_encoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx".into()),
                vector_estimator: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx".into()),
                vocoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx".into()),
                tts_json: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json".into()),
                unicode_indexer: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin".into()),
                voice_style: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Jest to silnik syntezatora mowy wykorzystujący Kaldi nowej generacji";

    let gen_config = GenerationConfig {
        sid: 0,
        num_steps: 8,
        speed: 1.0,
        extra: Some(r#"{"lang": "pl"}"#.into()),
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with supertonic-3 with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      supertonic: {
        durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
        textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
        vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
        vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
        ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
        unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
        voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
      },
      debug: true,
      numThreads: 2,
      provider: 'cpu',
    },
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Jest to silnik syntezatora mowy wykorzystujący Kaldi nowej generacji';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  numSteps: 8,
  speed: 1.0,
  extra: {lang: 'pl'},
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.
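
The Node.js example above reports a real-time factor (RTF): the time spent generating divided by the duration of the generated audio, so a value below 1 means generation is faster than playback. The same arithmetic is sketched below in Python under the assumption that you already have the elapsed time, the number of samples, and the sample rate; `real_time_factor` is a hypothetical helper name, not a sherpa-onnx function.

```python
def real_time_factor(elapsed_seconds: float, num_samples: int, sample_rate: int) -> float:
    """Return elapsed time divided by audio duration.

    Hypothetical helper for illustration; not part of sherpa-onnx.
    """
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration_seconds = num_samples / sample_rate
    return elapsed_seconds / duration_seconds


# 48000 samples at 24000 Hz is 2 seconds of audio; 0.5 s of compute gives 0.25.
print(real_time_factor(0.5, 48000, 24000))  # prints 0.25
```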

Dart API

You can use the following code to play with supertonic-3 with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final supertonic = sherpa_onnx.OfflineTtsSupertonicModelConfig(
    durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
    textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
    vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
    vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
    ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
    unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
    voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    supertonic: supertonic,
    numThreads: 2,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    numSteps: 8,
    speed: 1.0,
    extra: {'lang': 'pl'},
  );
  final audio = tts.generateWithConfig(text: 'Jest to silnik syntezatora mowy wykorzystujący Kaldi nowej generacji', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with supertonic-3 with Swift API.

func run() {
  let supertonic = sherpaOnnxOfflineTtsSupertonicModelConfig(
    durationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
    textEncoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
    vectorEstimator: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
    vocoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
    ttsJson: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
    unicodeIndexer: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
    voiceStyle: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(supertonic: supertonic)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Jest to silnik syntezatora mowy wykorzystujący Kaldi nowej generacji"
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.numSteps = 8
  genConfig.speed = 1.0
  genConfig.extra = ["lang": "pl"]

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with supertonic-3 with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Supertonic.DurationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.Model.Supertonic.TextEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.Model.Supertonic.VectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.Model.Supertonic.Vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.Model.Supertonic.TtsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.Model.Supertonic.UnicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.Model.Supertonic.VoiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";

config.Model.NumThreads = 2;
config.Model.Debug = 1;
config.Model.Provider = "cpu";

var tts = new OfflineTts(config);
var text = "Jest to silnik syntezatora mowy wykorzystujący Kaldi nowej generacji";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.NumSteps = 8;
genConfig.Speed = 1.0f;
genConfig.Extra = "{\"lang\": \"pl\"}";

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with supertonic-3 with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      supertonic = OfflineTtsSupertonicModelConfig(
        durationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
        textEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
        vectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
        vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
        ttsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
        unicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
        voiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
      ),
      numThreads = 2,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    numSteps = 8,
    speed = 1.0f,
    extra = mapOf("lang" to "pl"),
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Jest to silnik syntezatora mowy wykorzystujący Kaldi nowej generacji",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with supertonic-3 with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var supertonic = new OfflineTtsSupertonicModelConfig();
    supertonic.setDurationPredictor("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx");
    supertonic.setTextEncoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx");
    supertonic.setVectorEstimator("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx");
    supertonic.setVocoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx");
    supertonic.setTtsJson("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json");
    supertonic.setUnicodeIndexer("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin");
    supertonic.setVoiceStyle("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setSupertonic(supertonic);
    modelConfig.setNumThreads(2);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);

    var tts = new OfflineTts(config);
    var text = "Jest to silnik syntezatora mowy wykorzystujący Kaldi nowej generacji";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setNumSteps(8);
    genConfig.setSpeed(1.0f);
    genConfig.setExtra("{\"lang\": \"pl\"}");

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with supertonic-3 with Pascal API.

program test_supertonic;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Supertonic.DurationPredictor := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx';
  Config.Model.Supertonic.TextEncoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx';
  Config.Model.Supertonic.VectorEstimator := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx';
  Config.Model.Supertonic.Vocoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx';
  Config.Model.Supertonic.TtsJson := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json';
  Config.Model.Supertonic.UnicodeIndexer := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin';
  Config.Model.Supertonic.VoiceStyle := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin';

  Config.Model.NumThreads := 2;
  Config.Model.Debug := True;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.NumSteps := 8;
  GenConfig.Speed := 1.0;
  GenConfig.Extra := '{"lang": "pl"}';

  Audio := Tts.GenerateWithConfig('Jest to silnik syntezatora mowy wykorzystujący Kaldi nowej generacji', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with supertonic-3 with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Supertonic: sherpa.OfflineTtsSupertonicModelConfig{
				DurationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
				TextEncoder:       "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
				VectorEstimator:   "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
				Vocoder:           "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
				TtsJson:           "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
				UnicodeIndexer:    "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
				VoiceStyle:        "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
			},
			NumThreads: 2,
			Debug:      true,
		},
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Jest to silnik syntezatora mowy wykorzystujący Kaldi nowej generacji"

	genConfig := sherpa.GenerationConfig{
		Sid:       0,
		NumSteps:  8,
		Speed:     1.0,
		Extra:     `{"lang": "pl"}`,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.
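
Every example above writes its result to test.wav. A quick way to inspect such a file is sketched below with Python's standard `wave` module. It assumes the file is an uncompressed PCM WAV file, which is what the `wave` module can read; `describe_wav` is a hypothetical helper name, not a sherpa-onnx function.

```python
import wave


def describe_wav(path: str) -> dict:
    """Return the sample rate, channel count, and duration of a PCM WAV file.

    Hypothetical helper for illustration; not part of sherpa-onnx.
    """
    with wave.open(path, "rb") as f:
        sample_rate = f.getframerate()
        num_frames = f.getnframes()
        return {
            "sample_rate": sample_rate,
            "channels": f.getnchannels(),
            "duration_seconds": num_frames / sample_rate,
        }
```

Calling `describe_wav("test.wav")` after running one of the examples returns a dictionary whose `sample_rate` entry can be compared with the sample rate listed for the model.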

Samples

Sample audios for different speakers are listed below:

Speaker 0

0

Witaj świecie.

1

Jak się dziś masz?

2

Niebo jest niebieskie, a wiatr jest łagodny.

3

Uczenie maszynowe pomaga komputerom uczyć się z danych.

4

Synteza mowy zamienia tekst w wyraźny dźwięk.

5

Uczniowie przeczytali krótką historię w bibliotece.

6

Pociąg spóźnił się z powodu konserwacji torów.

7

Małe modele działają szybko na lokalnych urządzeniach.

8

Asystent głosowy pomaga w codziennych zadaniach.

9

Stabilne czytanie jest ważne dla krótkich i długich zdań.

Speaker 1

0

Witaj świecie.

1

Jak się dziś masz?

2

Niebo jest niebieskie, a wiatr jest łagodny.

3

Uczenie maszynowe pomaga komputerom uczyć się z danych.

4

Synteza mowy zamienia tekst w wyraźny dźwięk.

5

Uczniowie przeczytali krótką historię w bibliotece.

6

Pociąg spóźnił się z powodu konserwacji torów.

7

Małe modele działają szybko na lokalnych urządzeniach.

8

Asystent głosowy pomaga w codziennych zadaniach.

9

Stabilne czytanie jest ważne dla krótkich i długich zdań.

Speaker 2

0

Witaj świecie.

1

Jak się dziś masz?

2

Niebo jest niebieskie, a wiatr jest łagodny.

3

Uczenie maszynowe pomaga komputerom uczyć się z danych.

4

Synteza mowy zamienia tekst w wyraźny dźwięk.

5

Uczniowie przeczytali krótką historię w bibliotece.

6

Pociąg spóźnił się z powodu konserwacji torów.

7

Małe modele działają szybko na lokalnych urządzeniach.

8

Asystent głosowy pomaga w codziennych zadaniach.

9

Stabilne czytanie jest ważne dla krótkich i długich zdań.

Speaker 3

0

Witaj świecie.

1

Jak się dziś masz?

2

Niebo jest niebieskie, a wiatr jest łagodny.

3

Uczenie maszynowe pomaga komputerom uczyć się z danych.

4

Synteza mowy zamienia tekst w wyraźny dźwięk.

5

Uczniowie przeczytali krótką historię w bibliotece.

6

Pociąg spóźnił się z powodu konserwacji torów.

7

Małe modele działają szybko na lokalnych urządzeniach.

8

Asystent głosowy pomaga w codziennych zadaniach.

9

Stabilne czytanie jest ważne dla krótkich i długich zdań.

Speaker 4

0

Witaj świecie.

1

Jak się dziś masz?

2

Niebo jest niebieskie, a wiatr jest łagodny.

3

Uczenie maszynowe pomaga komputerom uczyć się z danych.

4

Synteza mowy zamienia tekst w wyraźny dźwięk.

5

Uczniowie przeczytali krótką historię w bibliotece.

6

Pociąg spóźnił się z powodu konserwacji torów.

7

Małe modele działają szybko na lokalnych urządzeniach.

8

Asystent głosowy pomaga w codziennych zadaniach.

9

Stabilne czytanie jest ważne dla krótkich i długich zdań.

Speaker 5

0

Witaj świecie.

1

Jak się dziś masz?

2

Niebo jest niebieskie, a wiatr jest łagodny.

3

Uczenie maszynowe pomaga komputerom uczyć się z danych.

4

Synteza mowy zamienia tekst w wyraźny dźwięk.

5

Uczniowie przeczytali krótką historię w bibliotece.

6

Pociąg spóźnił się z powodu konserwacji torów.

7

Małe modele działają szybko na lokalnych urządzeniach.

8

Asystent głosowy pomaga w codziennych zadaniach.

9

Stabilne czytanie jest ważne dla krótkich i długich zdań.

Speaker 6

0

Witaj świecie.

1

Jak się dziś masz?

2

Niebo jest niebieskie, a wiatr jest łagodny.

3

Uczenie maszynowe pomaga komputerom uczyć się z danych.

4

Synteza mowy zamienia tekst w wyraźny dźwięk.

5

Uczniowie przeczytali krótką historię w bibliotece.

6

Pociąg spóźnił się z powodu konserwacji torów.

7

Małe modele działają szybko na lokalnych urządzeniach.

8

Asystent głosowy pomaga w codziennych zadaniach.

9

Stabilne czytanie jest ważne dla krótkich i długich zdań.

Speaker 7

0

Witaj świecie.

1

Jak się dziś masz?

2

Niebo jest niebieskie, a wiatr jest łagodny.

3

Uczenie maszynowe pomaga komputerom uczyć się z danych.

4

Synteza mowy zamienia tekst w wyraźny dźwięk.

5

Uczniowie przeczytali krótką historię w bibliotece.

6

Pociąg spóźnił się z powodu konserwacji torów.

7

Małe modele działają szybko na lokalnych urządzeniach.

8

Asystent głosowy pomaga w codziennych zadaniach.

9

Stabilne czytanie jest ważne dla krótkich i długich zdań.

Speaker 8

0

Witaj świecie.

1

Jak się dziś masz?

2

Niebo jest niebieskie, a wiatr jest łagodny.

3

Uczenie maszynowe pomaga komputerom uczyć się z danych.

4

Synteza mowy zamienia tekst w wyraźny dźwięk.

5

Uczniowie przeczytali krótką historię w bibliotece.

6

Pociąg spóźnił się z powodu konserwacji torów.

7

Małe modele działają szybko na lokalnych urządzeniach.

8

Asystent głosowy pomaga w codziennych zadaniach.

9

Stabilne czytanie jest ważne dla krótkich i długich zdań.

Speaker 9

0

Witaj świecie.

1

Jak się dziś masz?

2

Niebo jest niebieskie, a wiatr jest łagodny.

3

Uczenie maszynowe pomaga komputerom uczyć się z danych.

4

Synteza mowy zamienia tekst w wyraźny dźwięk.

5

Uczniowie przeczytali krótką historię w bibliotece.

6

Pociąg spóźnił się z powodu konserwacji torów.

7

Małe modele działają szybko na lokalnych urządzeniach.

8

Asystent głosowy pomaga w codziennych zadaniach.

9

Stabilne czytanie jest ważne dla krótkich i długich zdań.

Portuguese

This section lists text to speech models for Portuguese.

vits-piper-pt_BR-cadu-medium

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/pt/pt_BR/cadu/medium

Number of speakers    Sample rate
1                     22050

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-pt_BR-cadu-medium.tar.bz2
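
The examples in this section refer to files under a vits-piper-pt_BR-cadu-medium/ directory. The following is a minimal Python sketch for fetching and unpacking the archive. The helper names (model_url, extract_archive, download_and_extract) are hypothetical, and it assumes the archive unpacks into a directory named after the model.

```python
import tarfile
import urllib.request
from pathlib import Path

# URL prefix taken from the download address above.
BASE_URL = "https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models"


def model_url(name: str) -> str:
    """Return the download URL of the archive for the given model name."""
    return f"{BASE_URL}/{name}.tar.bz2"


def extract_archive(archive: Path, dest_dir: Path) -> None:
    """Unpack a .tar.bz2 archive into dest_dir."""
    with tarfile.open(archive, "r:bz2") as tar:
        tar.extractall(dest_dir)


def download_and_extract(name: str, dest_dir: Path = Path(".")) -> Path:
    """Download the archive for `name`, unpack it, and delete the archive."""
    archive = dest_dir / f"{name}.tar.bz2"
    urllib.request.urlretrieve(model_url(name), archive)
    extract_archive(archive, dest_dir)
    archive.unlink()
    # Assumption: the archive contains a top-level directory named `name`.
    return dest_dir / name
```

Calling download_and_extract("vits-piper-pt_BR-cadu-medium") from the directory where you run the examples should leave the model files where the code below expects them.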

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI            URL         China mirror
arm64-v8a      Download    Download
armeabi-v7a    Download    Download
x86_64         Download    Download
x86            Download    Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to try vits-piper-pt_BR-cadu-medium with the C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-pt_BR-cadu-medium/pt_BR-cadu-medium.onnx";
  config.model.vits.tokens = "vits-piper-pt_BR-cadu-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-pt_BR-cadu-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

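  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
  if (!tts) {
    // Creation fails, for example, when a model file path is wrong
    fprintf(stderr, "Failed to create the TTS engine. Please check your config\n");
    return 1;
  }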

  const char *text = "Marinha sem vento, não chega a porto";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
  -DSHERPA_ONNX_ENABLE_C_API=ON \
  -DCMAKE_BUILD_TYPE=Release \
  -DBUILD_SHARED_LIBS=ON \
  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc \
  -I /tmp/sherpa-onnx/shared/include \
  -L /tmp/sherpa-onnx/shared/lib \
  -o /tmp/test-piper \
  /tmp/test-piper.c \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx and soundfile via

pip install sherpa-onnx soundfile

and you have downloaded and extracted the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-pt_BR-cadu-medium.tar.bz2

You can use the following code to play with vits-piper-pt_BR-cadu-medium

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-pt_BR-cadu-medium/pt_BR-cadu-medium.onnx",
            data_dir="vits-piper-pt_BR-cadu-medium/espeak-ng-data",
            tokens="vits-piper-pt_BR-cadu-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Marinha sem vento, não chega a porto",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
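
If you prefer not to depend on soundfile, the generated samples can also be written with the Python standard library. The following is only a sketch: write_wav_int16 is a hypothetical helper, and it assumes the samples are mono floats in the range [-1, 1].

```python
import struct
import wave


def write_wav_int16(path, samples, sample_rate):
    """Write mono float samples (assumed to lie in [-1, 1]) as 16-bit PCM."""
    pcm = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))  # clip to avoid integer overflow
        pcm += struct.pack("<h", int(s * 32767))
    with wave.open(path, "wb") as f:
        f.setnchannels(1)  # mono
        f.setsampwidth(2)  # 2 bytes = 16 bits per sample
        f.setframerate(sample_rate)
        f.writeframes(bytes(pcm))
```

With the audio object from the example above, the call would be write_wav_int16("test.wav", audio.samples, audio.sample_rate).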

C++ API

You can use the following code to try vits-piper-pt_BR-cadu-medium with the C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-pt_BR-cadu-medium/pt_BR-cadu-medium.onnx";
  config.model.vits.tokens = "vits-piper-pt_BR-cadu-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-pt_BR-cadu-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Marinha sem vento, não chega a porto";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
  -DSHERPA_ONNX_ENABLE_C_API=ON \
  -DCMAKE_BUILD_TYPE=Release \
  -DBUILD_SHARED_LIBS=ON \
  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++ \
  -std=c++17 \
  -I /tmp/sherpa-onnx/shared/include \
  -L /tmp/sherpa-onnx/shared/lib \
  -o /tmp/test-piper \
  /tmp/test-piper.cc \
  -lsherpa-onnx-cxx-api \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to try vits-piper-pt_BR-cadu-medium with the Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-pt_BR-cadu-medium/pt_BR-cadu-medium.onnx".into()),
                tokens: Some("vits-piper-pt_BR-cadu-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-pt_BR-cadu-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Marinha sem vento, não chega a porto";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-pt_BR-cadu-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-pt_BR-cadu-medium/pt_BR-cadu-medium.onnx',
        tokens: 'vits-piper-pt_BR-cadu-medium/tokens.txt',
        dataDir: 'vits-piper-pt_BR-cadu-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Marinha sem vento, não chega a porto';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.
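
The Node.js example above prints a real-time factor (RTF), which is the time spent generating divided by the duration of the generated audio. A value below 1 means generation is faster than playback. The same arithmetic as a small Python sketch, where real_time_factor is a hypothetical helper and the numbers are made up for illustration:

```python
def real_time_factor(elapsed_seconds: float, num_samples: int, sample_rate: int) -> float:
    """Return processing time divided by the duration of the generated audio."""
    duration = num_samples / sample_rate  # audio length in seconds
    if duration <= 0:
        raise ValueError("the generated audio is empty")
    return elapsed_seconds / duration


# Hypothetical numbers: 0.5 s of compute for 44100 samples at 22050 Hz (2 s of audio).
rtf = real_time_factor(0.5, 44100, 22050)
print(f"RTF = {rtf:.3f}")  # prints "RTF = 0.250"
```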

Dart API

You can use the following code to try vits-piper-pt_BR-cadu-medium with the Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-pt_BR-cadu-medium/pt_BR-cadu-medium.onnx',
    tokens: 'vits-piper-pt_BR-cadu-medium/tokens.txt',
    dataDir: 'vits-piper-pt_BR-cadu-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Marinha sem vento, não chega a porto', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to try vits-piper-pt_BR-cadu-medium with the Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-pt_BR-cadu-medium/pt_BR-cadu-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-pt_BR-cadu-medium/tokens.txt",
    dataDir: "vits-piper-pt_BR-cadu-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Marinha sem vento, não chega a porto"
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to try vits-piper-pt_BR-cadu-medium with the C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-pt_BR-cadu-medium/pt_BR-cadu-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-pt_BR-cadu-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-pt_BR-cadu-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Marinha sem vento, não chega a porto";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to try vits-piper-pt_BR-cadu-medium with the Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-pt_BR-cadu-medium/pt_BR-cadu-medium.onnx",
        tokens = "vits-piper-pt_BR-cadu-medium/tokens.txt",
        dataDir = "vits-piper-pt_BR-cadu-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Marinha sem vento, não chega a porto",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to try vits-piper-pt_BR-cadu-medium with the Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-pt_BR-cadu-medium/pt_BR-cadu-medium.onnx");
    vits.setTokens("vits-piper-pt_BR-cadu-medium/tokens.txt");
    vits.setDataDir("vits-piper-pt_BR-cadu-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Marinha sem vento, não chega a porto";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to try vits-piper-pt_BR-cadu-medium with the Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-pt_BR-cadu-medium/pt_BR-cadu-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-pt_BR-cadu-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-pt_BR-cadu-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Marinha sem vento, não chega a porto', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to try vits-piper-pt_BR-cadu-medium with the Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-pt_BR-cadu-medium/pt_BR-cadu-medium.onnx",
				Tokens:  "vits-piper-pt_BR-cadu-medium/tokens.txt",
				DataDir: "vits-piper-pt_BR-cadu-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Marinha sem vento, não chega a porto"

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Marinha sem vento, não chega a porto

sample audios for different speakers are listed below:

Speaker 0

vits-piper-pt_BR-dii-high

Info about this model

This model is converted from https://huggingface.co/OpenVoiceOS/pipertts_pt-BR_dii

Number of speakers    Sample rate
1                     22050

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-pt_BR-dii-high.tar.bz2
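
Every example for this model points at the same three paths: the .onnx model file, tokens.txt, and the espeak-ng-data directory. Before running any of them, it may help to check that these paths exist. The following is a minimal sketch, where missing_model_files is a hypothetical helper and the model is assumed to be extracted into the current directory.

```python
from pathlib import Path


def missing_model_files(model_dir, onnx_name):
    """Return the expected paths under model_dir that do not exist."""
    root = Path(model_dir)
    expected = [root / onnx_name, root / "tokens.txt", root / "espeak-ng-data"]
    return [str(p) for p in expected if not p.exists()]


# Paths are the ones used by the examples in this section.
missing = missing_model_files("vits-piper-pt_BR-dii-high", "pt_BR-dii-high.onnx")
if missing:
    print("Missing:", ", ".join(missing))
```

An empty list means all three paths were found.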

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI            URL         China mirror
arm64-v8a      Download    Download
armeabi-v7a    Download    Download
x86_64         Download    Download
x86            Download    Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to try vits-piper-pt_BR-dii-high with the C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-pt_BR-dii-high/pt_BR-dii-high.onnx";
  config.model.vits.tokens = "vits-piper-pt_BR-dii-high/tokens.txt";
  config.model.vits.data_dir = "vits-piper-pt_BR-dii-high/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

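  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
  if (!tts) {
    // Creation fails, for example, when a model file path is wrong
    fprintf(stderr, "Failed to create the TTS engine. Please check your config\n");
    return 1;
  }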

  const char *text = "Marinha sem vento, não chega a porto";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
  -DSHERPA_ONNX_ENABLE_C_API=ON \
  -DCMAKE_BUILD_TYPE=Release \
  -DBUILD_SHARED_LIBS=ON \
  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc \
  -I /tmp/sherpa-onnx/shared/include \
  -L /tmp/sherpa-onnx/shared/lib \
  -o /tmp/test-piper \
  /tmp/test-piper.c \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx and soundfile via

pip install sherpa-onnx soundfile

and you have downloaded and extracted the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-pt_BR-dii-high.tar.bz2

You can use the following code to play with vits-piper-pt_BR-dii-high

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-pt_BR-dii-high/pt_BR-dii-high.onnx",
            data_dir="vits-piper-pt_BR-dii-high/espeak-ng-data",
            tokens="vits-piper-pt_BR-dii-high/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Marinha sem vento, não chega a porto",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)

C++ API

You can use the following code to try vits-piper-pt_BR-dii-high with the C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-pt_BR-dii-high/pt_BR-dii-high.onnx";
  config.model.vits.tokens = "vits-piper-pt_BR-dii-high/tokens.txt";
  config.model.vits.data_dir = "vits-piper-pt_BR-dii-high/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Marinha sem vento, não chega a porto";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
  -DSHERPA_ONNX_ENABLE_C_API=ON \
  -DCMAKE_BUILD_TYPE=Release \
  -DBUILD_SHARED_LIBS=ON \
  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++ \
  -std=c++17 \
  -I /tmp/sherpa-onnx/shared/include \
  -L /tmp/sherpa-onnx/shared/lib \
  -o /tmp/test-piper \
  /tmp/test-piper.cc \
  -lsherpa-onnx-cxx-api \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to try vits-piper-pt_BR-dii-high with the Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-pt_BR-dii-high/pt_BR-dii-high.onnx".into()),
                tokens: Some("vits-piper-pt_BR-dii-high/tokens.txt".into()),
                data_dir: Some("vits-piper-pt_BR-dii-high/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Marinha sem vento, não chega a porto";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-pt_BR-dii-high with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-pt_BR-dii-high/pt_BR-dii-high.onnx',
        tokens: 'vits-piper-pt_BR-dii-high/tokens.txt',
        dataDir: 'vits-piper-pt_BR-dii-high/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Marinha sem vento, não chega a porto';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to try vits-piper-pt_BR-dii-high with the Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-pt_BR-dii-high/pt_BR-dii-high.onnx',
    tokens: 'vits-piper-pt_BR-dii-high/tokens.txt',
    dataDir: 'vits-piper-pt_BR-dii-high/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Marinha sem vento, não chega a porto', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-pt_BR-dii-high with Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-pt_BR-dii-high/pt_BR-dii-high.onnx",
    lexicon: "",
    tokens: "vits-piper-pt_BR-dii-high/tokens.txt",
    dataDir: "vits-piper-pt_BR-dii-high/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Marinha sem vento, não chega a porto"
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-pt_BR-dii-high with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-pt_BR-dii-high/pt_BR-dii-high.onnx";
config.Model.Vits.Tokens = "vits-piper-pt_BR-dii-high/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-pt_BR-dii-high/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Marinha sem vento, não chega a porto";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-pt_BR-dii-high with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  val config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-pt_BR-dii-high/pt_BR-dii-high.onnx",
        tokens = "vits-piper-pt_BR-dii-high/tokens.txt",
        dataDir = "vits-piper-pt_BR-dii-high/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Marinha sem vento, não chega a porto",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with vits-piper-pt_BR-dii-high with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-pt_BR-dii-high/pt_BR-dii-high.onnx");
    vits.setTokens("vits-piper-pt_BR-dii-high/tokens.txt");
    vits.setDataDir("vits-piper-pt_BR-dii-high/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Marinha sem vento, não chega a porto";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-pt_BR-dii-high with Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-pt_BR-dii-high/pt_BR-dii-high.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-pt_BR-dii-high/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-pt_BR-dii-high/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Marinha sem vento, não chega a porto', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-pt_BR-dii-high with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-pt_BR-dii-high/pt_BR-dii-high.onnx",
				Tokens:  "vits-piper-pt_BR-dii-high/tokens.txt",
				DataDir: "vits-piper-pt_BR-dii-high/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Marinha sem vento, não chega a porto"

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Marinha sem vento, não chega a porto

audio samples for each speaker are listed below:

Speaker 0

vits-piper-pt_BR-edresson-low

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/pt/pt_BR/edresson/low

Number of speakers | Sample rate
1 | 16000

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-pt_BR-edresson-low.tar.bz2
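
Every example in this section passes the same three paths to the VITS config: the `.onnx` model, `tokens.txt`, and the `espeak-ng-data` directory. The following is a minimal Python sketch for checking that layout before running an example. It assumes the archive extracts into a directory named after the model, as the paths in the examples suggest. The helper names `expected_files` and `missing_files` are hypothetical and not part of sherpa-onnx.

```python
from pathlib import Path


def expected_files(model_dir: str) -> list[Path]:
    """List the paths the examples pass to the VITS config for a piper model directory."""
    root = Path(model_dir)
    # Assumption: the model file is named after the directory without the "vits-piper-" prefix.
    voice = root.name.removeprefix("vits-piper-")  # e.g. "pt_BR-edresson-low"
    return [root / f"{voice}.onnx", root / "tokens.txt", root / "espeak-ng-data"]


def missing_files(model_dir: str) -> list[str]:
    """Return the expected paths that do not exist (an empty list means the layout is complete)."""
    return [str(p) for p in expected_files(model_dir) if not p.exists()]


print(missing_files("vits-piper-pt_BR-edresson-low"))
```

An empty list means the three paths used by the examples are present relative to the current directory.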

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI | URL | China mirror
arm64-v8a | Download | Download
armeabi-v7a | Download | Download
x86_64 | Download | Download
x86 | Download | Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-pt_BR-edresson-low with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-pt_BR-edresson-low/pt_BR-edresson-low.onnx";
  config.model.vits.tokens = "vits-piper-pt_BR-edresson-low/tokens.txt";
  config.model.vits.data_dir = "vits-piper-pt_BR-edresson-low/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Marinha sem vento, não chega a porto";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
  -DSHERPA_ONNX_ENABLE_C_API=ON \
  -DCMAKE_BUILD_TYPE=Release \
  -DBUILD_SHARED_LIBS=ON \
  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
  ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-piper \
  /tmp/test-piper.c \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx and soundfile (used below to write the audio file) via

pip install sherpa-onnx soundfile

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-pt_BR-edresson-low.tar.bz2

You can use the following code to play with vits-piper-pt_BR-edresson-low with Python API.

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-pt_BR-edresson-low/pt_BR-edresson-low.onnx",
            data_dir="vits-piper-pt_BR-edresson-low/espeak-ng-data",
            tokens="vits-piper-pt_BR-edresson-low/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Marinha sem vento, não chega a porto",
                     sid=0,
                     speed=1.0)

sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
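
The example above relies on the third-party soundfile package to write the result. If only the Python standard library is available, a WAV file can be written with the `wave` module instead. This is a minimal sketch, assuming the generated samples are mono floats in the range [-1, 1]; the helper name `write_wav_pcm16` is hypothetical and not part of sherpa-onnx.

```python
import struct
import wave


def write_wav_pcm16(filename: str, samples, sample_rate: int) -> int:
    """Write mono float samples as 16-bit PCM WAV; return the number of frames written."""
    pcm = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))  # clip to the assumed [-1, 1] range
        pcm += struct.pack("<h", int(round(s * 32767)))  # little-endian int16
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)  # mono
        f.setsampwidth(2)  # 2 bytes = 16 bits per sample
        f.setframerate(sample_rate)
        f.writeframes(bytes(pcm))
    return len(pcm) // 2
```

With the objects from the example above, the call would look like `write_wav_pcm16("test.wav", audio.samples, audio.sample_rate)`.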

C++ API

You can use the following code to play with vits-piper-pt_BR-edresson-low with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-pt_BR-edresson-low/pt_BR-edresson-low.onnx";
  config.model.vits.tokens = "vits-piper-pt_BR-edresson-low/tokens.txt";
  config.model.vits.data_dir = "vits-piper-pt_BR-edresson-low/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Marinha sem vento, não chega a porto";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
  -DSHERPA_ONNX_ENABLE_C_API=ON \
  -DCMAKE_BUILD_TYPE=Release \
  -DBUILD_SHARED_LIBS=ON \
  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
  ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++ \
  -std=c++17 \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-piper \
  /tmp/test-piper.cc \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-cxx-api \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with vits-piper-pt_BR-edresson-low with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-pt_BR-edresson-low/pt_BR-edresson-low.onnx".into()),
                tokens: Some("vits-piper-pt_BR-edresson-low/tokens.txt".into()),
                data_dir: Some("vits-piper-pt_BR-edresson-low/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Marinha sem vento, não chega a porto";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-pt_BR-edresson-low with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-pt_BR-edresson-low/pt_BR-edresson-low.onnx',
        tokens: 'vits-piper-pt_BR-edresson-low/tokens.txt',
        dataDir: 'vits-piper-pt_BR-edresson-low/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Marinha sem vento, não chega a porto';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

const start = Date.now();
const audio = tts.generate({text, generationConfig});
const stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to play with vits-piper-pt_BR-edresson-low with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-pt_BR-edresson-low/pt_BR-edresson-low.onnx',
    tokens: 'vits-piper-pt_BR-edresson-low/tokens.txt',
    dataDir: 'vits-piper-pt_BR-edresson-low/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Marinha sem vento, não chega a porto', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-pt_BR-edresson-low with Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-pt_BR-edresson-low/pt_BR-edresson-low.onnx",
    lexicon: "",
    tokens: "vits-piper-pt_BR-edresson-low/tokens.txt",
    dataDir: "vits-piper-pt_BR-edresson-low/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Marinha sem vento, não chega a porto"
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-pt_BR-edresson-low with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-pt_BR-edresson-low/pt_BR-edresson-low.onnx";
config.Model.Vits.Tokens = "vits-piper-pt_BR-edresson-low/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-pt_BR-edresson-low/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Marinha sem vento, não chega a porto";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-pt_BR-edresson-low with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  val config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-pt_BR-edresson-low/pt_BR-edresson-low.onnx",
        tokens = "vits-piper-pt_BR-edresson-low/tokens.txt",
        dataDir = "vits-piper-pt_BR-edresson-low/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Marinha sem vento, não chega a porto",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with vits-piper-pt_BR-edresson-low with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-pt_BR-edresson-low/pt_BR-edresson-low.onnx");
    vits.setTokens("vits-piper-pt_BR-edresson-low/tokens.txt");
    vits.setDataDir("vits-piper-pt_BR-edresson-low/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Marinha sem vento, não chega a porto";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-pt_BR-edresson-low with Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-pt_BR-edresson-low/pt_BR-edresson-low.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-pt_BR-edresson-low/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-pt_BR-edresson-low/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Marinha sem vento, não chega a porto', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-pt_BR-edresson-low with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-pt_BR-edresson-low/pt_BR-edresson-low.onnx",
				Tokens:  "vits-piper-pt_BR-edresson-low/tokens.txt",
				DataDir: "vits-piper-pt_BR-edresson-low/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Marinha sem vento, não chega a porto"

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Marinha sem vento, não chega a porto

audio samples for each speaker are listed below:

Speaker 0

vits-piper-pt_BR-faber-medium

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/pt/pt_BR/faber/medium

Number of speakers | Sample rate
1 | 22050

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-pt_BR-faber-medium.tar.bz2
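
The download addresses in this document follow one pattern: the model name plus `.tar.bz2` under the `tts-models` release. The following is a minimal Python sketch of downloading and extracting an archive with the standard library only. The helper names (`model_url`, `extract_archive`, `download_model`) are hypothetical, and the sketch assumes the archive extracts into a directory named after the model.

```python
import tarfile
import urllib.request
from pathlib import Path

RELEASE_BASE = "https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models"


def model_url(name: str) -> str:
    """Build the release URL for a model archive, e.g. vits-piper-pt_BR-faber-medium."""
    return f"{RELEASE_BASE}/{name}.tar.bz2"


def extract_archive(archive: str, dest: str = ".") -> None:
    """Extract a .tar.bz2 archive into dest."""
    with tarfile.open(archive, "r:bz2") as tar:
        tar.extractall(dest)


def download_model(name: str, dest: str = ".") -> Path:
    """Download and extract a model archive; return the assumed model directory."""
    archive = Path(dest) / f"{name}.tar.bz2"
    urllib.request.urlretrieve(model_url(name), archive)  # network access required
    extract_archive(str(archive), dest)
    archive.unlink()  # remove the archive once it has been extracted
    return Path(dest) / name
```

Calling `download_model("vits-piper-pt_BR-faber-medium")` would fetch the archive listed above into the current directory.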

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI | URL | China mirror
arm64-v8a | Download | Download
armeabi-v7a | Download | Download
x86_64 | Download | Download
x86 | Download | Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-pt_BR-faber-medium with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-pt_BR-faber-medium/pt_BR-faber-medium.onnx";
  config.model.vits.tokens = "vits-piper-pt_BR-faber-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-pt_BR-faber-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Marinha sem vento, não chega a porto";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
  -DSHERPA_ONNX_ENABLE_C_API=ON \
  -DCMAKE_BUILD_TYPE=Release \
  -DBUILD_SHARED_LIBS=ON \
  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
  ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-piper \
  /tmp/test-piper.c \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx and soundfile (used below to write the audio file) via

pip install sherpa-onnx soundfile

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-pt_BR-faber-medium.tar.bz2

You can use the following code to play with vits-piper-pt_BR-faber-medium with Python API.

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-pt_BR-faber-medium/pt_BR-faber-medium.onnx",
            data_dir="vits-piper-pt_BR-faber-medium/espeak-ng-data",
            tokens="vits-piper-pt_BR-faber-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Marinha sem vento, não chega a porto",
                     sid=0,
                     speed=1.0)

sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)

C++ API

You can use the following code to play with vits-piper-pt_BR-faber-medium with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-pt_BR-faber-medium/pt_BR-faber-medium.onnx";
  config.model.vits.tokens = "vits-piper-pt_BR-faber-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-pt_BR-faber-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Marinha sem vento, não chega a porto";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files in /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++   -std=c++17   -I /tmp/sherpa-onnx/shared/include   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-cxx-api   -lsherpa-onnx-c-api   -lonnxruntime   -o /tmp/test-piper   /tmp/test-piper.cc

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to try vits-piper-pt_BR-faber-medium with the Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-pt_BR-faber-medium/pt_BR-faber-medium.onnx".into()),
                tokens: Some("vits-piper-pt_BR-faber-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-pt_BR-faber-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Marinha sem vento, não chega a porto";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-pt_BR-faber-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-pt_BR-faber-medium/pt_BR-faber-medium.onnx',
        tokens: 'vits-piper-pt_BR-faber-medium/tokens.txt',
        dataDir: 'vits-piper-pt_BR-faber-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Marinha sem vento, não chega a porto';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.
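
The real-time factor (RTF) printed by the example above is the synthesis time divided by the duration of the generated audio, so values below 1 mean synthesis ran faster than real time. For readers using the Python API, the same arithmetic can be sketched as follows. The names real_time_factor and timed are hypothetical helpers for illustration, not sherpa-onnx functions.

```python
import time


def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """Return elapsed time divided by audio duration (lower is faster)."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # audio length in seconds
    return elapsed_seconds / duration


def timed(fn, *args, **kwargs):
    """Call fn and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start
```

For instance, audio, elapsed = timed(tts.generate, text=text, sid=0, speed=1.0) followed by real_time_factor(elapsed, len(audio.samples), audio.sample_rate) would give the RTF for one call.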

Dart API

You can use the following code to try vits-piper-pt_BR-faber-medium with the Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-pt_BR-faber-medium/pt_BR-faber-medium.onnx',
    tokens: 'vits-piper-pt_BR-faber-medium/tokens.txt',
    dataDir: 'vits-piper-pt_BR-faber-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Marinha sem vento, não chega a porto', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to try vits-piper-pt_BR-faber-medium with the Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-pt_BR-faber-medium/pt_BR-faber-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-pt_BR-faber-medium/tokens.txt",
    dataDir: "vits-piper-pt_BR-faber-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Marinha sem vento, não chega a porto"
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to try vits-piper-pt_BR-faber-medium with the C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-pt_BR-faber-medium/pt_BR-faber-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-pt_BR-faber-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-pt_BR-faber-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Marinha sem vento, não chega a porto";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to try vits-piper-pt_BR-faber-medium with the Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-pt_BR-faber-medium/pt_BR-faber-medium.onnx",
        tokens = "vits-piper-pt_BR-faber-medium/tokens.txt",
        dataDir = "vits-piper-pt_BR-faber-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Marinha sem vento, não chega a porto",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to try vits-piper-pt_BR-faber-medium with the Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-pt_BR-faber-medium/pt_BR-faber-medium.onnx");
    vits.setTokens("vits-piper-pt_BR-faber-medium/tokens.txt");
    vits.setDataDir("vits-piper-pt_BR-faber-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Marinha sem vento, não chega a porto";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to try vits-piper-pt_BR-faber-medium with the Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-pt_BR-faber-medium/pt_BR-faber-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-pt_BR-faber-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-pt_BR-faber-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Marinha sem vento, não chega a porto', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to try vits-piper-pt_BR-faber-medium with the Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-pt_BR-faber-medium/pt_BR-faber-medium.onnx",
				Tokens:  "vits-piper-pt_BR-faber-medium/tokens.txt",
				DataDir: "vits-piper-pt_BR-faber-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Marinha sem vento, não chega a porto"

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Marinha sem vento, não chega a porto

sample audio for each speaker is listed below:

Speaker 0

vits-piper-pt_BR-jeff-medium

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/pt/pt_BR/jeff/medium

Number of speakers | Sample rate
1 | 22050

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-pt_BR-jeff-medium.tar.bz2
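
The archive can be fetched and unpacked with any tool that handles .tar.bz2 files. As one option, the sketch below does this with Python's standard library. download_and_extract is a hypothetical helper name, and the returned path assumes the archive unpacks into a directory named after the archive, which matches the paths used in the code examples for this model.

```python
import tarfile
import urllib.request
from pathlib import Path

MODEL_URL = (
    "https://github.com/k2-fsa/sherpa-onnx/releases/download/"
    "tts-models/vits-piper-pt_BR-jeff-medium.tar.bz2"
)


def download_and_extract(url, dest="."):
    """Download a .tar.bz2 archive into dest and unpack it there."""
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)

    archive = dest / url.rsplit("/", 1)[-1]
    if not archive.exists():  # skip the download if the file is already present
        urllib.request.urlretrieve(url, archive)

    with tarfile.open(archive, "r:bz2") as tar:
        tar.extractall(dest)

    # Assumption: the top-level directory is the archive name without ".tar.bz2".
    return dest / archive.name[: -len(".tar.bz2")]
```

Calling download_and_extract(MODEL_URL) from the directory where you run the examples should leave a vits-piper-pt_BR-jeff-medium directory next to your code.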

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI | URL | China mirror
arm64-v8a | Download | Download
armeabi-v7a | Download | Download
x86_64 | Download | Download
x86 | Download | Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to try vits-piper-pt_BR-jeff-medium with the C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-pt_BR-jeff-medium/pt_BR-jeff-medium.onnx";
  config.model.vits.tokens = "vits-piper-pt_BR-jeff-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-pt_BR-jeff-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Marinha sem vento, não chega a porto";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files in /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc   -I /tmp/sherpa-onnx/shared/include   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-c-api   -lonnxruntime   -o /tmp/test-piper   /tmp/test-piper.c

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html
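
After the program finishes, it can be useful to confirm that ./test.wav was written in the expected format. The following sketch uses Python's standard wave module. It assumes the output is an uncompressed PCM WAV file, which this document does not state explicitly, and describe_wav is a hypothetical helper name.

```python
import wave


def describe_wav(filename):
    """Return basic header information for a PCM WAV file."""
    with wave.open(filename, "rb") as f:
        sample_rate = f.getframerate()
        return {
            "sample_rate": sample_rate,
            "channels": f.getnchannels(),
            "bits_per_sample": f.getsampwidth() * 8,
            "duration_seconds": f.getnframes() / sample_rate,
        }
```

For this model, describe_wav("test.wav")["sample_rate"] should match the sample rate of 22050 listed in the model information above.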

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-pt_BR-jeff-medium.tar.bz2

You can use the following code to play with vits-piper-pt_BR-jeff-medium

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-pt_BR-jeff-medium/pt_BR-jeff-medium.onnx",
            data_dir="vits-piper-pt_BR-jeff-medium/espeak-ng-data",
            tokens="vits-piper-pt_BR-jeff-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Marinha sem vento, não chega a porto",
                     sid=0,
                     speed=1.0)

sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)

C++ API

You can use the following code to try vits-piper-pt_BR-jeff-medium with the C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-pt_BR-jeff-medium/pt_BR-jeff-medium.onnx";
  config.model.vits.tokens = "vits-piper-pt_BR-jeff-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-pt_BR-jeff-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Marinha sem vento, não chega a porto";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files in /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++   -std=c++17   -I /tmp/sherpa-onnx/shared/include   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-cxx-api   -lsherpa-onnx-c-api   -lonnxruntime   -o /tmp/test-piper   /tmp/test-piper.cc

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to try vits-piper-pt_BR-jeff-medium with the Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-pt_BR-jeff-medium/pt_BR-jeff-medium.onnx".into()),
                tokens: Some("vits-piper-pt_BR-jeff-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-pt_BR-jeff-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Marinha sem vento, não chega a porto";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-pt_BR-jeff-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-pt_BR-jeff-medium/pt_BR-jeff-medium.onnx',
        tokens: 'vits-piper-pt_BR-jeff-medium/tokens.txt',
        dataDir: 'vits-piper-pt_BR-jeff-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Marinha sem vento, não chega a porto';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to try vits-piper-pt_BR-jeff-medium with the Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-pt_BR-jeff-medium/pt_BR-jeff-medium.onnx',
    tokens: 'vits-piper-pt_BR-jeff-medium/tokens.txt',
    dataDir: 'vits-piper-pt_BR-jeff-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Marinha sem vento, não chega a porto', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to try vits-piper-pt_BR-jeff-medium with the Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-pt_BR-jeff-medium/pt_BR-jeff-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-pt_BR-jeff-medium/tokens.txt",
    dataDir: "vits-piper-pt_BR-jeff-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Marinha sem vento, não chega a porto"
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to try vits-piper-pt_BR-jeff-medium with the C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-pt_BR-jeff-medium/pt_BR-jeff-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-pt_BR-jeff-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-pt_BR-jeff-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Marinha sem vento, não chega a porto";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to try vits-piper-pt_BR-jeff-medium with the Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-pt_BR-jeff-medium/pt_BR-jeff-medium.onnx",
        tokens = "vits-piper-pt_BR-jeff-medium/tokens.txt",
        dataDir = "vits-piper-pt_BR-jeff-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Marinha sem vento, não chega a porto",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to try vits-piper-pt_BR-jeff-medium with the Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-pt_BR-jeff-medium/pt_BR-jeff-medium.onnx");
    vits.setTokens("vits-piper-pt_BR-jeff-medium/tokens.txt");
    vits.setDataDir("vits-piper-pt_BR-jeff-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Marinha sem vento, não chega a porto";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to try vits-piper-pt_BR-jeff-medium with the Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-pt_BR-jeff-medium/pt_BR-jeff-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-pt_BR-jeff-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-pt_BR-jeff-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Marinha sem vento, não chega a porto', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to try vits-piper-pt_BR-jeff-medium with the Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-pt_BR-jeff-medium/pt_BR-jeff-medium.onnx",
				Tokens:  "vits-piper-pt_BR-jeff-medium/tokens.txt",
				DataDir: "vits-piper-pt_BR-jeff-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Marinha sem vento, não chega a porto"

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Marinha sem vento, não chega a porto

sample audio for each speaker is listed below:

Speaker 0

vits-piper-pt_BR-miro-high

Info about this model

This model is converted from https://huggingface.co/OpenVoiceOS/pipertts_pt-BR_miro

Number of speakers | Sample rate
1 | 22050

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-pt_BR-miro-high.tar.bz2

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI | URL | China mirror
arm64-v8a | Download | Download
armeabi-v7a | Download | Download
x86_64 | Download | Download
x86 | Download | Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-pt_BR-miro-high with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-pt_BR-miro-high/pt_BR-miro-high.onnx";
  config.model.vits.tokens = "vits-piper-pt_BR-miro-high/tokens.txt";
  config.model.vits.data_dir = "vits-piper-pt_BR-miro-high/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

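  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
  if (!tts) {
    // Creation fails if, for example, a model path is wrong
    fprintf(stderr, "Failed to create the TTS object. Please check your config\n");
    return -1;
  }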

  const char *text = "Marinha sem vento, não chega a porto";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
  -DSHERPA_ONNX_ENABLE_C_API=ON \
  -DCMAKE_BUILD_TYPE=Release \
  -DBUILD_SHARED_LIBS=ON \
  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
  ..

make
make install

You can find the required header and library files in /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-piper \
  /tmp/test-piper.c \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-pt_BR-miro-high.tar.bz2

You can use the following code to play with vits-piper-pt_BR-miro-high

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-pt_BR-miro-high/pt_BR-miro-high.onnx",
            data_dir="vits-piper-pt_BR-miro-high/espeak-ng-data",
            tokens="vits-piper-pt_BR-miro-high/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Marinha sem vento, não chega a porto",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
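
If soundfile is not available, the generated samples can also be saved with the Python standard library alone. The following is a minimal sketch, not part of sherpa-onnx: write_wave_16bit is a hypothetical helper name, and it assumes the samples are mono floats in the range [-1, 1], clipping anything outside that range.

```python
import struct
import wave


def write_wave_16bit(filename, samples, sample_rate):
    """Write mono float samples as a 16-bit PCM WAV file.

    Hypothetical helper, not a sherpa-onnx API.
    Returns the number of frames written.
    """
    frames = bytearray()
    for s in samples:
        # Assumption: samples are floats in [-1, 1]; clip anything outside
        s = max(-1.0, min(1.0, float(s)))
        frames += struct.pack("<h", int(round(s * 32767)))

    with wave.open(filename, "wb") as f:
        f.setnchannels(1)  # mono
        f.setsampwidth(2)  # 2 bytes per sample, i.e. 16-bit
        f.setframerate(int(sample_rate))
        f.writeframes(bytes(frames))

    return len(frames) // 2
```

With the objects from the example above, the call would look like write_wave_16bit("test.wav", audio.samples, audio.sample_rate).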

C++ API

You can use the following code to play with vits-piper-pt_BR-miro-high with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-pt_BR-miro-high/pt_BR-miro-high.onnx";
  config.model.vits.tokens = "vits-piper-pt_BR-miro-high/tokens.txt";
  config.model.vits.data_dir = "vits-piper-pt_BR-miro-high/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Marinha sem vento, não chega a porto";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
  -DSHERPA_ONNX_ENABLE_C_API=ON \
  -DCMAKE_BUILD_TYPE=Release \
  -DBUILD_SHARED_LIBS=ON \
  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
  ..

make
make install

You can find the required header and library files in /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++ \
  -std=c++17 \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-piper \
  /tmp/test-piper.cc \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-cxx-api \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with vits-piper-pt_BR-miro-high with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-pt_BR-miro-high/pt_BR-miro-high.onnx".into()),
                tokens: Some("vits-piper-pt_BR-miro-high/tokens.txt".into()),
                data_dir: Some("vits-piper-pt_BR-miro-high/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Marinha sem vento, não chega a porto";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-pt_BR-miro-high with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-pt_BR-miro-high/pt_BR-miro-high.onnx',
        tokens: 'vits-piper-pt_BR-miro-high/tokens.txt',
        dataDir: 'vits-piper-pt_BR-miro-high/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Marinha sem vento, não chega a porto';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.
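
The Node.js example above reports a real-time factor (RTF): the time spent generating divided by the duration of the generated audio. The arithmetic can be sketched on its own as follows; real_time_factor is a hypothetical helper name and the numbers are illustrative, not measurements of this model.

```python
def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """Return the generation time divided by the audio duration."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # seconds of audio
    return elapsed_seconds / duration


# 44100 samples at 22050 Hz are 2 seconds of audio.
# Spending 0.5 seconds to generate them gives an RTF of 0.25.
print(real_time_factor(0.5, 44100, 22050))  # 0.25
```

An RTF below 1 means the audio is generated faster than it takes to play it back.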

Dart API

You can use the following code to play with vits-piper-pt_BR-miro-high with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-pt_BR-miro-high/pt_BR-miro-high.onnx',
    tokens: 'vits-piper-pt_BR-miro-high/tokens.txt',
    dataDir: 'vits-piper-pt_BR-miro-high/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Marinha sem vento, não chega a porto', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-pt_BR-miro-high with Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-pt_BR-miro-high/pt_BR-miro-high.onnx",
    lexicon: "",
    tokens: "vits-piper-pt_BR-miro-high/tokens.txt",
    dataDir: "vits-piper-pt_BR-miro-high/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Marinha sem vento, não chega a porto"
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-pt_BR-miro-high with C# API.

using System;
using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-pt_BR-miro-high/pt_BR-miro-high.onnx";
config.Model.Vits.Tokens = "vits-piper-pt_BR-miro-high/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-pt_BR-miro-high/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Marinha sem vento, não chega a porto";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-pt_BR-miro-high with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-pt_BR-miro-high/pt_BR-miro-high.onnx",
        tokens = "vits-piper-pt_BR-miro-high/tokens.txt",
        dataDir = "vits-piper-pt_BR-miro-high/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Marinha sem vento, não chega a porto",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.
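
The callback in the Kotlin example above returns 1 to continue and 0 to stop. The sketch below illustrates that contract in plain Python. It does not call sherpa-onnx: generate_in_chunks and stop_after are hypothetical names, and the sketch assumes the chunk that triggers the stop is still kept, which may differ from what the library does.

```python
def generate_in_chunks(chunks, callback):
    """Hypothetical stand-in for a generator that honors a 1/0 callback."""
    produced = []
    for chunk in chunks:
        produced.extend(chunk)
        # A return value of 0 asks the producer to stop early
        if callback(chunk) == 0:
            break
    return produced


def stop_after(limit):
    """Build a callback that returns 0 once `limit` chunks have been seen."""
    seen = {"n": 0}

    def callback(samples):
        seen["n"] += 1
        return 1 if seen["n"] < limit else 0

    return callback


chunks = [[0.1, 0.2], [0.3], [0.4, 0.5]]
print(len(generate_in_chunks(chunks, lambda samples: 1)))  # 5
print(len(generate_in_chunks(chunks, stop_after(2))))  # 3
```

A callback like this is a natural place to stream samples to an audio device or to cancel generation when the user interrupts playback.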

Java API

You can use the following code to play with vits-piper-pt_BR-miro-high with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-pt_BR-miro-high/pt_BR-miro-high.onnx");
    vits.setTokens("vits-piper-pt_BR-miro-high/tokens.txt");
    vits.setDataDir("vits-piper-pt_BR-miro-high/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Marinha sem vento, não chega a porto";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-pt_BR-miro-high with Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-pt_BR-miro-high/pt_BR-miro-high.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-pt_BR-miro-high/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-pt_BR-miro-high/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Marinha sem vento, não chega a porto', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-pt_BR-miro-high with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-pt_BR-miro-high/pt_BR-miro-high.onnx",
				Tokens:  "vits-piper-pt_BR-miro-high/tokens.txt",
				DataDir: "vits-piper-pt_BR-miro-high/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Marinha sem vento, não chega a porto"

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.
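
Every example above points at the same three paths inside the extracted model directory: the .onnx file, tokens.txt, and espeak-ng-data. A quick check before creating the TTS object can save some debugging. The following is a minimal sketch; missing_model_files is a hypothetical helper, not part of sherpa-onnx.

```python
from pathlib import Path


def missing_model_files(model_dir, onnx_name):
    """Return the expected model paths that do not exist.

    Hypothetical helper: it only checks for presence,
    it does not validate the contents of the files.
    """
    root = Path(model_dir)
    expected = [
        root / onnx_name,
        root / "tokens.txt",
        root / "espeak-ng-data",
    ]
    return [str(p) for p in expected if not p.exists()]


missing = missing_model_files("vits-piper-pt_BR-miro-high", "pt_BR-miro-high.onnx")
if missing:
    print("Missing:", ", ".join(missing))
else:
    print("All model files found")
```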

Samples

For the following text:

Marinha sem vento, não chega a porto

audio samples for each speaker are listed below:

Speaker 0

vits-piper-pt_PT-dii-high

Info about this model

This model is converted from https://huggingface.co/OpenVoiceOS/pipertts_pt-PT_dii

Number of speakers    Sample rate
1                     22050

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-pt_PT-dii-high.tar.bz2
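
The archive can be fetched and unpacked with wget and tar, as shown for other models in this document. If those tools are not at hand, the same steps can be sketched with the Python standard library. The helper names below are hypothetical, and the sketch assumes the archive is a bzip2-compressed tar file, as its .tar.bz2 extension suggests.

```python
import tarfile
import urllib.request
from pathlib import Path


def extract_archive(archive, dest_dir="."):
    """Extract a .tar.bz2 archive and return its top-level entry names."""
    # Use the safer "data" extraction filter when this Python provides it
    kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
    with tarfile.open(archive, "r:bz2") as tar:
        names = sorted({name.split("/")[0] for name in tar.getnames()})
        tar.extractall(dest_dir, **kwargs)
    return names


def download_and_extract(url, dest_dir="."):
    """Download the archive at `url`, extract it, then delete the archive."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    archive = dest / url.rsplit("/", 1)[-1]
    urllib.request.urlretrieve(url, archive)
    names = extract_archive(archive, dest)
    archive.unlink()
    return names
```

Calling download_and_extract with the URL above and a target directory of your choice should leave a vits-piper-pt_PT-dii-high directory there, which is the path prefix the examples in this section use.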

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI            URL         China mirror
arm64-v8a      Download    Download
armeabi-v7a    Download    Download
x86_64         Download    Download
x86            Download    Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-pt_PT-dii-high with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-pt_PT-dii-high/pt_PT-dii-high.onnx";
  config.model.vits.tokens = "vits-piper-pt_PT-dii-high/tokens.txt";
  config.model.vits.data_dir = "vits-piper-pt_PT-dii-high/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

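  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
  if (!tts) {
    // Creation fails if, for example, a model path is wrong
    fprintf(stderr, "Failed to create the TTS object. Please check your config\n");
    return -1;
  }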

  const char *text = "Marinha sem vento, não chega a porto";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
  -DSHERPA_ONNX_ENABLE_C_API=ON \
  -DCMAKE_BUILD_TYPE=Release \
  -DBUILD_SHARED_LIBS=ON \
  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
  ..

make
make install

You can find the required header and library files in /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-piper \
  /tmp/test-piper.c \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-pt_PT-dii-high.tar.bz2

You can use the following code to play with vits-piper-pt_PT-dii-high

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-pt_PT-dii-high/pt_PT-dii-high.onnx",
            data_dir="vits-piper-pt_PT-dii-high/espeak-ng-data",
            tokens="vits-piper-pt_PT-dii-high/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Marinha sem vento, não chega a porto",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
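
To confirm that a generated WAV file has the sample rate listed for this model (22050), its header can be read back with the Python standard library. The following is a minimal sketch: describe_wave is a hypothetical helper, and it assumes a PCM-encoded WAV file, which is the only kind the wave module reads.

```python
import wave


def describe_wave(filename):
    """Return basic header information for a PCM WAV file."""
    with wave.open(filename, "rb") as f:
        sample_rate = f.getframerate()
        num_frames = f.getnframes()
        return {
            "sample_rate": sample_rate,
            "channels": f.getnchannels(),
            "num_frames": num_frames,
            "duration_seconds": num_frames / sample_rate,
        }
```

For example, describe_wave("./test.wav") on a file written by one of the examples in this section should report a sample_rate of 22050 if the file was saved at the model's native rate.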

C++ API

You can use the following code to play with vits-piper-pt_PT-dii-high with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-pt_PT-dii-high/pt_PT-dii-high.onnx";
  config.model.vits.tokens = "vits-piper-pt_PT-dii-high/tokens.txt";
  config.model.vits.data_dir = "vits-piper-pt_PT-dii-high/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Marinha sem vento, não chega a porto";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
  -DSHERPA_ONNX_ENABLE_C_API=ON \
  -DCMAKE_BUILD_TYPE=Release \
  -DBUILD_SHARED_LIBS=ON \
  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
  ..

make
make install

You can find the required header and library files in /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++ \
  -std=c++17 \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-piper \
  /tmp/test-piper.cc \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-cxx-api \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with vits-piper-pt_PT-dii-high with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-pt_PT-dii-high/pt_PT-dii-high.onnx".into()),
                tokens: Some("vits-piper-pt_PT-dii-high/tokens.txt".into()),
                data_dir: Some("vits-piper-pt_PT-dii-high/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Marinha sem vento, não chega a porto";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-pt_PT-dii-high with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-pt_PT-dii-high/pt_PT-dii-high.onnx',
        tokens: 'vits-piper-pt_PT-dii-high/tokens.txt',
        dataDir: 'vits-piper-pt_PT-dii-high/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Marinha sem vento, não chega a porto';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to play with vits-piper-pt_PT-dii-high with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-pt_PT-dii-high/pt_PT-dii-high.onnx',
    tokens: 'vits-piper-pt_PT-dii-high/tokens.txt',
    dataDir: 'vits-piper-pt_PT-dii-high/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Marinha sem vento, não chega a porto', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-pt_PT-dii-high with Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-pt_PT-dii-high/pt_PT-dii-high.onnx",
    lexicon: "",
    tokens: "vits-piper-pt_PT-dii-high/tokens.txt",
    dataDir: "vits-piper-pt_PT-dii-high/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Marinha sem vento, não chega a porto"
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-pt_PT-dii-high with C# API.

using System;
using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-pt_PT-dii-high/pt_PT-dii-high.onnx";
config.Model.Vits.Tokens = "vits-piper-pt_PT-dii-high/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-pt_PT-dii-high/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Marinha sem vento, não chega a porto";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-pt_PT-dii-high with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-pt_PT-dii-high/pt_PT-dii-high.onnx",
        tokens = "vits-piper-pt_PT-dii-high/tokens.txt",
        dataDir = "vits-piper-pt_PT-dii-high/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Marinha sem vento, não chega a porto",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.
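
The callback's return value (1 to continue, 0 to stop) can be used to cut generation short. The following is a minimal sketch of that idea in plain Python. It does not call sherpa-onnx at all; `make_limited_callback` and `max_samples` are hypothetical names introduced for illustration, and the optional `progress` argument is an assumption so that the same function could be called with or without a progress value.

```python
def make_limited_callback(max_samples: int):
    """Return a callback that asks the generator to stop once
    at least max_samples samples have been received."""
    received = 0

    def callback(samples, progress: float = 0.0) -> int:
        nonlocal received
        received += len(samples)
        # 1 means to continue, 0 means to stop
        return 1 if received < max_samples else 0

    return callback


callback = make_limited_callback(max_samples=5)
print(callback([0.0, 0.0, 0.0]))  # 3 samples received so far
print(callback([0.0, 0.0, 0.0]))  # 6 samples received so far
```

Keeping the counter inside a closure avoids global state, so each generation call can get its own independent limit.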

Java API

You can use the following code to play with vits-piper-pt_PT-dii-high with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-pt_PT-dii-high/pt_PT-dii-high.onnx");
    vits.setTokens("vits-piper-pt_PT-dii-high/tokens.txt");
    vits.setDataDir("vits-piper-pt_PT-dii-high/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Marinha sem vento, não chega a porto";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-pt_PT-dii-high with Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-pt_PT-dii-high/pt_PT-dii-high.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-pt_PT-dii-high/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-pt_PT-dii-high/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Marinha sem vento, não chega a porto', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-pt_PT-dii-high with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-pt_PT-dii-high/pt_PT-dii-high.onnx",
				Tokens:  "vits-piper-pt_PT-dii-high/tokens.txt",
				DataDir: "vits-piper-pt_PT-dii-high/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Marinha sem vento, não chega a porto"

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Marinha sem vento, não chega a porto

audio samples for each speaker are listed below:

Speaker 0

vits-piper-pt_PT-miro-high

Info about this model

This model is converted from https://huggingface.co/OpenVoiceOS/pipertts_pt-PT_miro

Number of speakers | Sample rate (Hz)
1 | 22050

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-pt_PT-miro-high.tar.bz2
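
If you prefer to fetch and unpack the archive from a script, the following is a minimal sketch that uses only the Python standard library. The helper names `download` and `extract` are hypothetical and introduced here for illustration; the URL is the download address given above.

```python
import tarfile
import urllib.request
from pathlib import Path

MODEL_URL = (
    "https://github.com/k2-fsa/sherpa-onnx/releases/download/"
    "tts-models/vits-piper-pt_PT-miro-high.tar.bz2"
)


def download(url: str, dest: Path) -> Path:
    """Download url to dest unless dest already exists."""
    if not dest.exists():
        urllib.request.urlretrieve(url, str(dest))
    return dest


def extract(archive: Path, out_dir: Path) -> list:
    """Extract a .tar.bz2 archive into out_dir and return the member names."""
    with tarfile.open(archive, "r:bz2") as tar:
        names = tar.getnames()
        tar.extractall(out_dir)
    return names


# Usage (performs a network download, so it is left commented out):
# archive = download(MODEL_URL, Path("vits-piper-pt_PT-miro-high.tar.bz2"))
# extract(archive, Path("."))
```

Skipping the download when the archive already exists makes the script safe to re-run.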

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI | URL | China mirror
arm64-v8a | Download | Download
armeabi-v7a | Download | Download
x86_64 | Download | Download
x86 | Download | Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-pt_PT-miro-high with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-pt_PT-miro-high/pt_PT-miro-high.onnx";
  config.model.vits.tokens = "vits-piper-pt_PT-miro-high/tokens.txt";
  config.model.vits.data_dir = "vits-piper-pt_PT-miro-high/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Marinha sem vento, não chega a porto";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-piper \
  /tmp/test-piper.c \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-pt_PT-miro-high.tar.bz2

You can use the following code to play with vits-piper-pt_PT-miro-high

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-pt_PT-miro-high/pt_PT-miro-high.onnx",
            data_dir="vits-piper-pt_PT-miro-high/espeak-ng-data",
            tokens="vits-piper-pt_PT-miro-high/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Marinha sem vento, não chega a porto",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
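
A wrong working directory is a common reason for the config check to fail. The following is a minimal sketch, assuming the archive has been extracted so that the model file, tokens.txt, and espeak-ng-data sit in one directory as in the paths used above. `missing_model_files` is a hypothetical helper introduced here; it is not part of sherpa-onnx.

```python
from pathlib import Path


def missing_model_files(model_dir: str, onnx_name: str) -> list:
    """Return the expected model paths that do not exist under model_dir."""
    root = Path(model_dir)
    expected = [
        root / onnx_name,         # the acoustic model
        root / "tokens.txt",      # the token table
        root / "espeak-ng-data",  # the espeak-ng data directory
    ]
    return [str(p) for p in expected if not p.exists()]


missing = missing_model_files("vits-piper-pt_PT-miro-high", "pt_PT-miro-high.onnx")
if missing:
    print("Missing files:", ", ".join(missing))
```

Running this before constructing the config gives a more specific message than a generic validation error.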

C++ API

You can use the following code to play with vits-piper-pt_PT-miro-high with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-pt_PT-miro-high/pt_PT-miro-high.onnx";
  config.model.vits.tokens = "vits-piper-pt_PT-miro-high/tokens.txt";
  config.model.vits.data_dir = "vits-piper-pt_PT-miro-high/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Marinha sem vento, não chega a porto";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++ \
  -std=c++17 \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-piper \
  /tmp/test-piper.cc \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-cxx-api \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with vits-piper-pt_PT-miro-high with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-pt_PT-miro-high/pt_PT-miro-high.onnx".into()),
                tokens: Some("vits-piper-pt_PT-miro-high/tokens.txt".into()),
                data_dir: Some("vits-piper-pt_PT-miro-high/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Marinha sem vento, não chega a porto";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-pt_PT-miro-high with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-pt_PT-miro-high/pt_PT-miro-high.onnx',
        tokens: 'vits-piper-pt_PT-miro-high/tokens.txt',
        dataDir: 'vits-piper-pt_PT-miro-high/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Marinha sem vento, não chega a porto';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.
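
The example above reports a real-time factor (RTF): the time spent generating divided by the duration of the generated audio, so a value below 1 means generation is faster than playback. The following is a minimal sketch of the same arithmetic in Python; `real_time_factor` is a hypothetical helper name, not a sherpa-onnx function.

```python
def real_time_factor(elapsed_seconds: float, num_samples: int, sample_rate: int) -> float:
    """Return elapsed time divided by the duration of the generated audio."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # audio duration in seconds
    return elapsed_seconds / duration


# 44100 samples at 22050 Hz is 2 seconds of audio;
# generating it in 0.5 seconds gives an RTF of 0.25.
print(real_time_factor(0.5, 44100, 22050))
```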

Dart API

You can use the following code to play with vits-piper-pt_PT-miro-high with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-pt_PT-miro-high/pt_PT-miro-high.onnx',
    tokens: 'vits-piper-pt_PT-miro-high/tokens.txt',
    dataDir: 'vits-piper-pt_PT-miro-high/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Marinha sem vento, não chega a porto', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-pt_PT-miro-high with Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-pt_PT-miro-high/pt_PT-miro-high.onnx",
    lexicon: "",
    tokens: "vits-piper-pt_PT-miro-high/tokens.txt",
    dataDir: "vits-piper-pt_PT-miro-high/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Marinha sem vento, não chega a porto"
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-pt_PT-miro-high with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-pt_PT-miro-high/pt_PT-miro-high.onnx";
config.Model.Vits.Tokens = "vits-piper-pt_PT-miro-high/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-pt_PT-miro-high/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Marinha sem vento, não chega a porto";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-pt_PT-miro-high with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-pt_PT-miro-high/pt_PT-miro-high.onnx",
        tokens = "vits-piper-pt_PT-miro-high/tokens.txt",
        dataDir = "vits-piper-pt_PT-miro-high/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Marinha sem vento, não chega a porto",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with vits-piper-pt_PT-miro-high with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-pt_PT-miro-high/pt_PT-miro-high.onnx");
    vits.setTokens("vits-piper-pt_PT-miro-high/tokens.txt");
    vits.setDataDir("vits-piper-pt_PT-miro-high/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Marinha sem vento, não chega a porto";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-pt_PT-miro-high with Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-pt_PT-miro-high/pt_PT-miro-high.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-pt_PT-miro-high/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-pt_PT-miro-high/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Marinha sem vento, não chega a porto', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-pt_PT-miro-high with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-pt_PT-miro-high/pt_PT-miro-high.onnx",
				Tokens:  "vits-piper-pt_PT-miro-high/tokens.txt",
				DataDir: "vits-piper-pt_PT-miro-high/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Marinha sem vento, não chega a porto"

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Marinha sem vento, não chega a porto

audio samples for each speaker are listed below:

Speaker 0

vits-piper-pt_PT-tugao-medium

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/pt/pt_PT/tugão/medium

Number of speakers | Sample rate (Hz)
1 | 22050

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-pt_PT-tugao-medium.tar.bz2

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI | URL | China mirror
arm64-v8a | Download | Download
armeabi-v7a | Download | Download
x86_64 | Download | Download
x86 | Download | Download

If you don’t know what ABI is, you probably need to select arm64-v8a.
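
If you can read the machine architecture of your device (for example, the string printed by `uname -m` in a shell on the device), it can be mapped to one of the ABI names in the table. The following is a minimal sketch; `abi_for_machine` is a hypothetical helper, the mapping covers only the common architecture strings, and it falls back to arm64-v8a as suggested above.

```python
# Common `uname -m` strings and the matching Android ABI names.
ABI_BY_MACHINE = {
    "aarch64": "arm64-v8a",
    "armv7l": "armeabi-v7a",
    "x86_64": "x86_64",
    "i686": "x86",
}


def abi_for_machine(machine: str, default: str = "arm64-v8a") -> str:
    """Map a machine architecture string to an Android ABI name."""
    return ABI_BY_MACHINE.get(machine.strip().lower(), default)


print(abi_for_machine("aarch64"))
```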

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-pt_PT-tugao-medium with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-pt_PT-tugao-medium/pt_PT-tugao-medium.onnx";
  config.model.vits.tokens = "vits-piper-pt_PT-tugao-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-pt_PT-tugao-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Marinha sem vento, não chega a porto";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-piper \
  /tmp/test-piper.c \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-pt_PT-tugao-medium.tar.bz2

You can use the following code to play with vits-piper-pt_PT-tugao-medium

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-pt_PT-tugao-medium/pt_PT-tugao-medium.onnx",
            data_dir="vits-piper-pt_PT-tugao-medium/espeak-ng-data",
            tokens="vits-piper-pt_PT-tugao-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Marinha sem vento, não chega a porto",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
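
If soundfile is not available, the samples can also be saved with the standard library only. The following is a minimal sketch, assuming the generated samples are mono floating-point values in the range [-1, 1]; `write_wave_int16` is a hypothetical helper introduced here for illustration.

```python
import struct
import wave


def write_wave_int16(filename: str, samples, sample_rate: int) -> int:
    """Write mono float samples as 16-bit PCM and return the number of frames."""
    frames = bytearray()
    for s in samples:
        clipped = max(-1.0, min(1.0, float(s)))  # guard against overflow
        frames += struct.pack("<h", int(clipped * 32767))
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)   # mono
        f.setsampwidth(2)   # 2 bytes per sample, i.e. 16-bit
        f.setframerate(sample_rate)
        f.writeframes(bytes(frames))
    return len(frames) // 2


# Usage with the objects from the example above (left commented out):
# write_wave_int16("test.wav", audio.samples, audio.sample_rate)
```

Clipping before scaling keeps out-of-range values from wrapping around when converted to 16-bit integers.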

C++ API

You can use the following code to play with vits-piper-pt_PT-tugao-medium with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-pt_PT-tugao-medium/pt_PT-tugao-medium.onnx";
  config.model.vits.tokens = "vits-piper-pt_PT-tugao-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-pt_PT-tugao-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Marinha sem vento, não chega a porto";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++ \
  -std=c++17 \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-piper \
  /tmp/test-piper.cc \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-cxx-api \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with vits-piper-pt_PT-tugao-medium with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-pt_PT-tugao-medium/pt_PT-tugao-medium.onnx".into()),
                tokens: Some("vits-piper-pt_PT-tugao-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-pt_PT-tugao-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Marinha sem vento, não chega a porto";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-pt_PT-tugao-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-pt_PT-tugao-medium/pt_PT-tugao-medium.onnx',
        tokens: 'vits-piper-pt_PT-tugao-medium/tokens.txt',
        dataDir: 'vits-piper-pt_PT-tugao-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Marinha sem vento, não chega a porto';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.
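The Node.js example above reports a real-time factor (RTF): the time spent generating divided by the duration of the generated audio. The same arithmetic is sketched below in Python. The function name and the numbers in the demo are illustrative assumptions, not part of sherpa-onnx.

```python
def real_time_factor(elapsed_seconds: float, num_samples: int, sample_rate: int) -> float:
    """Return elapsed time divided by audio duration.

    A value below 1.0 means the audio was generated faster than it plays back.
    """
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # audio length in seconds
    return elapsed_seconds / duration


if __name__ == "__main__":
    # Hypothetical measurement: 44100 samples at 22050 Hz generated in 0.5 s.
    rtf = real_time_factor(0.5, 44100, 22050)
    print(f"RTF = {rtf:.3f}")
```

In practice you would pass the measured wall-clock time together with the sample count and sample rate of the audio returned by the TTS engine.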

Dart API

You can use the following code to play with vits-piper-pt_PT-tugao-medium with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-pt_PT-tugao-medium/pt_PT-tugao-medium.onnx',
    tokens: 'vits-piper-pt_PT-tugao-medium/tokens.txt',
    dataDir: 'vits-piper-pt_PT-tugao-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Marinha sem vento, não chega a porto', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-pt_PT-tugao-medium with Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-pt_PT-tugao-medium/pt_PT-tugao-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-pt_PT-tugao-medium/tokens.txt",
    dataDir: "vits-piper-pt_PT-tugao-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Marinha sem vento, não chega a porto"
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-pt_PT-tugao-medium with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-pt_PT-tugao-medium/pt_PT-tugao-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-pt_PT-tugao-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-pt_PT-tugao-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Marinha sem vento, não chega a porto";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-pt_PT-tugao-medium with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-pt_PT-tugao-medium/pt_PT-tugao-medium.onnx",
        tokens = "vits-piper-pt_PT-tugao-medium/tokens.txt",
        dataDir = "vits-piper-pt_PT-tugao-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Marinha sem vento, não chega a porto",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with vits-piper-pt_PT-tugao-medium with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-pt_PT-tugao-medium/pt_PT-tugao-medium.onnx");
    vits.setTokens("vits-piper-pt_PT-tugao-medium/tokens.txt");
    vits.setDataDir("vits-piper-pt_PT-tugao-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Marinha sem vento, não chega a porto";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-pt_PT-tugao-medium with Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-pt_PT-tugao-medium/pt_PT-tugao-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-pt_PT-tugao-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-pt_PT-tugao-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Marinha sem vento, não chega a porto', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-pt_PT-tugao-medium with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-pt_PT-tugao-medium/pt_PT-tugao-medium.onnx",
				Tokens:  "vits-piper-pt_PT-tugao-medium/tokens.txt",
				DataDir: "vits-piper-pt_PT-tugao-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Marinha sem vento, não chega a porto"

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Marinha sem vento, não chega a porto

sample audios for different speakers are listed below:

Speaker 0

supertonic-3-pt

Info about this model

This model is supertonic 3 from https://huggingface.co/Supertone/supertonic-3

It supports 31 languages: en, ko, ja, ar, bg, cs, da, de, el, es, et, fi, fr, hi, hr, hu, id, it, lt, lv, nl, pl, pt, ro, ru, sk, sl, sv, tr, uk, vi.

This page shows samples for Portuguese (pt).
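Because the model covers many languages, the examples on this page pick one through a `lang` entry in the generation config's `extra` field; several of them pass it as a JSON string such as `{"lang": "pt"}`. The sketch below builds that string and rejects codes outside the 31 listed above. The helper name `lang_extra_json` is a hypothetical convenience, not a sherpa-onnx function.

```python
import json

# The 31 language codes listed above for supertonic-3.
SUPPORTED_LANGS = frozenset(
    "en ko ja ar bg cs da de el es et fi fr hi hr hu id it lt lv "
    "nl pl pt ro ru sk sl sv tr uk vi".split()
)


def lang_extra_json(lang: str) -> str:
    """Return a JSON string like {"lang": "pt"} for a supported language code."""
    code = lang.strip().lower()
    if code not in SUPPORTED_LANGS:
        raise ValueError(f"unsupported language code: {lang!r}")
    return json.dumps({"lang": code})


if __name__ == "__main__":
    print(lang_extra_json("pt"))
```

Validating the code up front gives a clear error message before any model is loaded.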

Number of speakers    Sample rate
10                    24000

Speaker IDs

sid    0    1    2    3    4    5    6    7    8    9

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-supertonic-3-tts-int8-2026-05-11.tar.bz2

Android APK

The following table lists the Android TTS Engine APKs that bundle this model, built for sherpa-onnx v1.13.2.

ABI            URL         China mirror
arm64-v8a      Download    Download
armeabi-v7a    Download    Download
x86_64         Download    Download
x86            Download    Download

If you don’t know what ABI is, you probably need to select arm64-v8a.
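If you can read the CPU architecture name of the device (for example the string printed by `uname -m`), a small lookup can suggest which row of the table to pick. The sketch below is only an illustration: the mapping table and the function name `guess_android_abi` are assumptions, and unknown names fall back to arm64-v8a as suggested above.

```python
# Assumed mapping from common `uname -m` style machine names to Android ABIs.
_MACHINE_TO_ABI = {
    "aarch64": "arm64-v8a",
    "arm64": "arm64-v8a",
    "armv7l": "armeabi-v7a",
    "armv8l": "armeabi-v7a",  # 32-bit userspace on a 64-bit ARM CPU
    "x86_64": "x86_64",
    "amd64": "x86_64",
    "i686": "x86",
    "i386": "x86",
}


def guess_android_abi(machine: str, default: str = "arm64-v8a") -> str:
    """Map a machine name to an Android ABI, falling back to `default`."""
    return _MACHINE_TO_ABI.get(machine.strip().lower(), default)


if __name__ == "__main__":
    for name in ("aarch64", "armv7l", "x86_64", "i686", "unknown"):
        print(name, "->", guess_android_abi(name))
```

On a device connected through adb, `adb shell getprop ro.product.cpu.abi` reports the ABI directly, which avoids guessing.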

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-supertonic-3-tts-int8-2026-05-11.tar.bz2

You can use the following code to play with supertonic-3 with the Python API.

import sherpa_onnx
import soundfile as sf

tts_config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        supertonic=sherpa_onnx.OfflineTtsSupertonicModelConfig(
            duration_predictor="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
            text_encoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
            vector_estimator="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
            vocoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
            tts_json="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
            unicode_indexer="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
            voice_style="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
        ),
        debug=False,
        num_threads=2,
        provider="cpu",
    ),
)

tts = sherpa_onnx.OfflineTts(tts_config)

gen_config = sherpa_onnx.GenerationConfig()
gen_config.sid = 0
gen_config.num_steps = 8
gen_config.speed = 1.0
gen_config.extra["lang"] = "pt"

audio = tts.generate("Este é um mecanismo de conversão de texto em fala usando Kaldi de próxima geração", gen_config)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
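After the script finishes, it can be useful to confirm that test.wav has the expected sample rate and a plausible duration. The sketch below uses only the Python standard library and assumes the file is integer PCM, which the `wave` module can read. The helper name `wav_info` is illustrative.

```python
import os
import wave


def wav_info(path):
    """Return (sample_rate, num_channels, duration_seconds) of a PCM WAV file."""
    with wave.open(path, "rb") as f:
        sample_rate = f.getframerate()
        num_frames = f.getnframes()
        return sample_rate, f.getnchannels(), num_frames / sample_rate


if __name__ == "__main__":
    filename = "test.wav"
    if os.path.exists(filename):
        rate, channels, seconds = wav_info(filename)
        print(f"{filename}: {rate} Hz, {channels} channel(s), {seconds:.3f} s")
    else:
        print(f"{filename} not found; run the TTS example first")
```

The reported sample rate should match `audio.sample_rate` from the generation step.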

C API

You can use the following code to play with supertonic-3 with C API.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));
  config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
  config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
  config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
  config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
  config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
  config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
  config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";

  config.model.num_threads = 2;

  // If you don't want to see debug messages, please set it to 0
  config.model.debug = 1;

  const char *text = "Este é um mecanismo de conversão de texto em fala usando Kaldi de próxima geração";

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.num_steps = 8;
  gen_cfg.speed = 1.0;
  gen_cfg.extra = "{\"lang\": \"pt\"}";

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg,
                                             ProgressCallback, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

To build and run this example, first build sherpa-onnx as a shared library with the C API enabled:

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
 -DSHERPA_ONNX_ENABLE_C_API=ON \
 -DCMAKE_BUILD_TYPE=Release \
 -DBUILD_SHARED_LIBS=ON \
 -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
 ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-supertonic.c. Then you can compile it with the following command. The libraries are listed after the source file so that linkers that resolve symbols from left to right can find them:

gcc \
  -I /tmp/sherpa-onnx/shared/include \
  -L /tmp/sherpa-onnx/shared/lib \
  -o /tmp/test-supertonic \
  /tmp/test-supertonic.c \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-supertonic.
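If you launch the compiled binary from a script, the same library path setup can be done per process instead of exporting it in the shell. The sketch below picks the variable by platform, mirroring the two export lines above. The function name `library_path_env` is a hypothetical helper, and other platforms are left untouched.

```python
import os
import platform


def library_path_env(lib_dir, system=None, environ=None):
    """Return a copy of the environment with lib_dir prepended to the
    platform's shared-library search variable."""
    system = system or platform.system()
    env = dict(os.environ if environ is None else environ)
    if system == "Linux":
        var = "LD_LIBRARY_PATH"
    elif system == "Darwin":  # macOS
        var = "DYLD_LIBRARY_PATH"
    else:
        return env  # nothing to change on other platforms
    current = env.get(var, "")
    # Avoid a trailing ":" when the variable was previously unset.
    env[var] = f"{lib_dir}:{current}" if current else lib_dir
    return env


if __name__ == "__main__":
    env = library_path_env("/tmp/sherpa-onnx/shared/lib")
    for name in ("LD_LIBRARY_PATH", "DYLD_LIBRARY_PATH"):
        if name in env:
            print(f"{name}={env[name]}")
```

The returned dictionary can be passed as the `env` argument of `subprocess.run` when starting /tmp/test-supertonic.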

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

C++ API

You can use the following code to play with supertonic-3 with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;

  config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
  config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
  config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
  config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
  config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
  config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
  config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";

  config.model.num_threads = 2;

  // If you don't want to see debug messages, please set it to 0
  config.model.debug = 1;

  std::string filename = "./test.wav";
  std::string text = "Este é um mecanismo de conversão de texto em fala usando Kaldi de próxima geração";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.num_steps = 8;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed
  gen_cfg.extra = R"({"lang": "pt"})";

  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

To build and run this example, first build sherpa-onnx as a shared library with the C API enabled:

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
 -DSHERPA_ONNX_ENABLE_C_API=ON \
 -DCMAKE_BUILD_TYPE=Release \
 -DBUILD_SHARED_LIBS=ON \
 -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
 ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-supertonic.cc. Then you can compile it with the following command. The libraries are listed after the source file so that linkers that resolve symbols from left to right can find them:

g++ \
  -std=c++17 \
  -I /tmp/sherpa-onnx/shared/include \
  -L /tmp/sherpa-onnx/shared/lib \
  -o /tmp/test-supertonic \
  /tmp/test-supertonic.cc \
  -lsherpa-onnx-cxx-api \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-supertonic.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with supertonic-3 with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsSupertonicModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            supertonic: OfflineTtsSupertonicModelConfig {
                duration_predictor: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx".into()),
                text_encoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx".into()),
                vector_estimator: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx".into()),
                vocoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx".into()),
                tts_json: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json".into()),
                unicode_indexer: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin".into()),
                voice_style: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Este é um mecanismo de conversão de texto em fala usando Kaldi de próxima geração";

    let gen_config = GenerationConfig {
        sid: 0,
        num_steps: 8,
        speed: 1.0,
        extra: Some(r#"{"lang": "pt"}"#.into()),
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with supertonic-3 with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      supertonic: {
        durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
        textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
        vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
        vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
        ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
        unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
        voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
      },
      debug: true,
      numThreads: 2,
      provider: 'cpu',
    },
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Este é um mecanismo de conversão de texto em fala usando Kaldi de próxima geração';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  numSteps: 8,
  speed: 1.0,
  extra: {lang: 'pt'},
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to play with supertonic-3 with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final supertonic = sherpa_onnx.OfflineTtsSupertonicModelConfig(
    durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
    textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
    vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
    vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
    ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
    unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
    voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    supertonic: supertonic,
    numThreads: 2,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    numSteps: 8,
    speed: 1.0,
    extra: {'lang': 'pt'},
  );
  final audio = tts.generateWithConfig(text: 'Este é um mecanismo de conversão de texto em fala usando Kaldi de próxima geração', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with supertonic-3 with Swift API.

func run() {
  let supertonic = sherpaOnnxOfflineTtsSupertonicModelConfig(
    durationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
    textEncoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
    vectorEstimator: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
    vocoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
    ttsJson: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
    unicodeIndexer: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
    voiceStyle: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(supertonic: supertonic)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Este é um mecanismo de conversão de texto em fala usando Kaldi de próxima geração"
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.numSteps = 8
  genConfig.speed = 1.0
  genConfig.extra = ["lang": "pt"]

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with supertonic-3 with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Supertonic.DurationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.Model.Supertonic.TextEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.Model.Supertonic.VectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.Model.Supertonic.Vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.Model.Supertonic.TtsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.Model.Supertonic.UnicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.Model.Supertonic.VoiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";

config.Model.NumThreads = 2;
config.Model.Debug = 1;
config.Model.Provider = "cpu";

var tts = new OfflineTts(config);
var text = "Este é um mecanismo de conversão de texto em fala usando Kaldi de próxima geração";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.NumSteps = 8;
genConfig.Speed = 1.0f;
genConfig.Extra = "{\"lang\": \"pt\"}";

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with supertonic-3 with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      supertonic = OfflineTtsSupertonicModelConfig(
        durationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
        textEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
        vectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
        vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
        ttsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
        unicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
        voiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
      ),
      numThreads = 2,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    numSteps = 8,
    speed = 1.0f,
    extra = mapOf("lang" to "pt"),
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Este é um mecanismo de conversão de texto em fala usando Kaldi de próxima geração",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with supertonic-3 with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var supertonic = new OfflineTtsSupertonicModelConfig();
    supertonic.setDurationPredictor("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx");
    supertonic.setTextEncoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx");
    supertonic.setVectorEstimator("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx");
    supertonic.setVocoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx");
    supertonic.setTtsJson("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json");
    supertonic.setUnicodeIndexer("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin");
    supertonic.setVoiceStyle("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setSupertonic(supertonic);
    modelConfig.setNumThreads(2);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);

    var tts = new OfflineTts(config);
    var text = "Este é um mecanismo de conversão de texto em fala usando Kaldi de próxima geração";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setNumSteps(8);
    genConfig.setSpeed(1.0f);
    genConfig.setExtra("{\"lang\": \"pt\"}");

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with supertonic-3 with Pascal API.

program test_supertonic;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Supertonic.DurationPredictor := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx';
  Config.Model.Supertonic.TextEncoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx';
  Config.Model.Supertonic.VectorEstimator := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx';
  Config.Model.Supertonic.Vocoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx';
  Config.Model.Supertonic.TtsJson := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json';
  Config.Model.Supertonic.UnicodeIndexer := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin';
  Config.Model.Supertonic.VoiceStyle := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin';

  Config.Model.NumThreads := 2;
  Config.Model.Debug := True;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.NumSteps := 8;
  GenConfig.Speed := 1.0;
  GenConfig.Extra := '{"lang": "pt"}';

  Audio := Tts.GenerateWithConfig('Este é um mecanismo de conversão de texto em fala usando Kaldi de próxima geração', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with supertonic-3 with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Supertonic: sherpa.OfflineTtsSupertonicModelConfig{
				DurationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
				TextEncoder:       "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
				VectorEstimator:   "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
				Vocoder:           "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
				TtsJson:           "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
				UnicodeIndexer:    "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
				VoiceStyle:        "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
			},
			NumThreads: 2,
			Debug:      true,
		},
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Este é um mecanismo de conversão de texto em fala usando Kaldi de próxima geração"

	genConfig := sherpa.GenerationConfig{
		Sid:      0,
		NumSteps: 8,
		Speed:    1.0,
		Extra:    `{"lang": "pt"}`,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

Sample audio clips for different speakers are listed below:

Speaker 0

0

Olá mundo.

1

Como você está hoje?

2

O céu é azul.

3

Eu amo aprendizado de máquina.

4

Python é incrível.

5

Bom dia a todos.

6

A inteligência artificial está crescendo.

7

A síntese de voz é fascinante.

8

As redes neurais são poderosas.

9

Texto para voz converte texto em áudio.

10

A rápida raposa marrom salta sobre o cachorro preguiçoso.

11

O aprendizado de máquina permite que computadores aprendam.

12

O processamento de linguagem natural ajuda máquinas a entender.

13

O aprendizado profundo revolucionou a inteligência artificial.

14

A tecnologia de síntese de voz avançou significativamente.

15

A clonagem de voz neural pode replicar estilos de fala.

16

A normalização de texto é importante para pronúncia.

17

Assistentes de voz nos ajudam a interagir com tecnologia.

18

Sistemas TTS modernos usam aprendizado profundo para áudio.

19

A interação humano computador tornou-se mais intuitiva.

Speaker 1

0

Olá mundo.

1

Como você está hoje?

2

O céu é azul.

3

Eu amo aprendizado de máquina.

4

Python é incrível.

5

Bom dia a todos.

6

A inteligência artificial está crescendo.

7

A síntese de voz é fascinante.

8

As redes neurais são poderosas.

9

Texto para voz converte texto em áudio.

10

A rápida raposa marrom salta sobre o cachorro preguiçoso.

11

O aprendizado de máquina permite que computadores aprendam.

12

O processamento de linguagem natural ajuda máquinas a entender.

13

O aprendizado profundo revolucionou a inteligência artificial.

14

A tecnologia de síntese de voz avançou significativamente.

15

A clonagem de voz neural pode replicar estilos de fala.

16

A normalização de texto é importante para pronúncia.

17

Assistentes de voz nos ajudam a interagir com tecnologia.

18

Sistemas TTS modernos usam aprendizado profundo para áudio.

19

A interação humano computador tornou-se mais intuitiva.

Speaker 2

0

Olá mundo.

1

Como você está hoje?

2

O céu é azul.

3

Eu amo aprendizado de máquina.

4

Python é incrível.

5

Bom dia a todos.

6

A inteligência artificial está crescendo.

7

A síntese de voz é fascinante.

8

As redes neurais são poderosas.

9

Texto para voz converte texto em áudio.

10

A rápida raposa marrom salta sobre o cachorro preguiçoso.

11

O aprendizado de máquina permite que computadores aprendam.

12

O processamento de linguagem natural ajuda máquinas a entender.

13

O aprendizado profundo revolucionou a inteligência artificial.

14

A tecnologia de síntese de voz avançou significativamente.

15

A clonagem de voz neural pode replicar estilos de fala.

16

A normalização de texto é importante para pronúncia.

17

Assistentes de voz nos ajudam a interagir com tecnologia.

18

Sistemas TTS modernos usam aprendizado profundo para áudio.

19

A interação humano computador tornou-se mais intuitiva.

Speaker 3

0

Olá mundo.

1

Como você está hoje?

2

O céu é azul.

3

Eu amo aprendizado de máquina.

4

Python é incrível.

5

Bom dia a todos.

6

A inteligência artificial está crescendo.

7

A síntese de voz é fascinante.

8

As redes neurais são poderosas.

9

Texto para voz converte texto em áudio.

10

A rápida raposa marrom salta sobre o cachorro preguiçoso.

11

O aprendizado de máquina permite que computadores aprendam.

12

O processamento de linguagem natural ajuda máquinas a entender.

13

O aprendizado profundo revolucionou a inteligência artificial.

14

A tecnologia de síntese de voz avançou significativamente.

15

A clonagem de voz neural pode replicar estilos de fala.

16

A normalização de texto é importante para pronúncia.

17

Assistentes de voz nos ajudam a interagir com tecnologia.

18

Sistemas TTS modernos usam aprendizado profundo para áudio.

19

A interação humano computador tornou-se mais intuitiva.

Speaker 4

0

Olá mundo.

1

Como você está hoje?

2

O céu é azul.

3

Eu amo aprendizado de máquina.

4

Python é incrível.

5

Bom dia a todos.

6

A inteligência artificial está crescendo.

7

A síntese de voz é fascinante.

8

As redes neurais são poderosas.

9

Texto para voz converte texto em áudio.

10

A rápida raposa marrom salta sobre o cachorro preguiçoso.

11

O aprendizado de máquina permite que computadores aprendam.

12

O processamento de linguagem natural ajuda máquinas a entender.

13

O aprendizado profundo revolucionou a inteligência artificial.

14

A tecnologia de síntese de voz avançou significativamente.

15

A clonagem de voz neural pode replicar estilos de fala.

16

A normalização de texto é importante para pronúncia.

17

Assistentes de voz nos ajudam a interagir com tecnologia.

18

Sistemas TTS modernos usam aprendizado profundo para áudio.

19

A interação humano computador tornou-se mais intuitiva.

Speaker 5

0

Olá mundo.

1

Como você está hoje?

2

O céu é azul.

3

Eu amo aprendizado de máquina.

4

Python é incrível.

5

Bom dia a todos.

6

A inteligência artificial está crescendo.

7

A síntese de voz é fascinante.

8

As redes neurais são poderosas.

9

Texto para voz converte texto em áudio.

10

A rápida raposa marrom salta sobre o cachorro preguiçoso.

11

O aprendizado de máquina permite que computadores aprendam.

12

O processamento de linguagem natural ajuda máquinas a entender.

13

O aprendizado profundo revolucionou a inteligência artificial.

14

A tecnologia de síntese de voz avançou significativamente.

15

A clonagem de voz neural pode replicar estilos de fala.

16

A normalização de texto é importante para pronúncia.

17

Assistentes de voz nos ajudam a interagir com tecnologia.

18

Sistemas TTS modernos usam aprendizado profundo para áudio.

19

A interação humano computador tornou-se mais intuitiva.

Speaker 6

0

Olá mundo.

1

Como você está hoje?

2

O céu é azul.

3

Eu amo aprendizado de máquina.

4

Python é incrível.

5

Bom dia a todos.

6

A inteligência artificial está crescendo.

7

A síntese de voz é fascinante.

8

As redes neurais são poderosas.

9

Texto para voz converte texto em áudio.

10

A rápida raposa marrom salta sobre o cachorro preguiçoso.

11

O aprendizado de máquina permite que computadores aprendam.

12

O processamento de linguagem natural ajuda máquinas a entender.

13

O aprendizado profundo revolucionou a inteligência artificial.

14

A tecnologia de síntese de voz avançou significativamente.

15

A clonagem de voz neural pode replicar estilos de fala.

16

A normalização de texto é importante para pronúncia.

17

Assistentes de voz nos ajudam a interagir com tecnologia.

18

Sistemas TTS modernos usam aprendizado profundo para áudio.

19

A interação humano computador tornou-se mais intuitiva.

Speaker 7

0

Olá mundo.

1

Como você está hoje?

2

O céu é azul.

3

Eu amo aprendizado de máquina.

4

Python é incrível.

5

Bom dia a todos.

6

A inteligência artificial está crescendo.

7

A síntese de voz é fascinante.

8

As redes neurais são poderosas.

9

Texto para voz converte texto em áudio.

10

A rápida raposa marrom salta sobre o cachorro preguiçoso.

11

O aprendizado de máquina permite que computadores aprendam.

12

O processamento de linguagem natural ajuda máquinas a entender.

13

O aprendizado profundo revolucionou a inteligência artificial.

14

A tecnologia de síntese de voz avançou significativamente.

15

A clonagem de voz neural pode replicar estilos de fala.

16

A normalização de texto é importante para pronúncia.

17

Assistentes de voz nos ajudam a interagir com tecnologia.

18

Sistemas TTS modernos usam aprendizado profundo para áudio.

19

A interação humano computador tornou-se mais intuitiva.

Speaker 8

0

Olá mundo.

1

Como você está hoje?

2

O céu é azul.

3

Eu amo aprendizado de máquina.

4

Python é incrível.

5

Bom dia a todos.

6

A inteligência artificial está crescendo.

7

A síntese de voz é fascinante.

8

As redes neurais são poderosas.

9

Texto para voz converte texto em áudio.

10

A rápida raposa marrom salta sobre o cachorro preguiçoso.

11

O aprendizado de máquina permite que computadores aprendam.

12

O processamento de linguagem natural ajuda máquinas a entender.

13

O aprendizado profundo revolucionou a inteligência artificial.

14

A tecnologia de síntese de voz avançou significativamente.

15

A clonagem de voz neural pode replicar estilos de fala.

16

A normalização de texto é importante para pronúncia.

17

Assistentes de voz nos ajudam a interagir com tecnologia.

18

Sistemas TTS modernos usam aprendizado profundo para áudio.

19

A interação humano computador tornou-se mais intuitiva.

Speaker 9

0

Olá mundo.

1

Como você está hoje?

2

O céu é azul.

3

Eu amo aprendizado de máquina.

4

Python é incrível.

5

Bom dia a todos.

6

A inteligência artificial está crescendo.

7

A síntese de voz é fascinante.

8

As redes neurais são poderosas.

9

Texto para voz converte texto em áudio.

10

A rápida raposa marrom salta sobre o cachorro preguiçoso.

11

O aprendizado de máquina permite que computadores aprendam.

12

O processamento de linguagem natural ajuda máquinas a entender.

13

O aprendizado profundo revolucionou a inteligência artificial.

14

A tecnologia de síntese de voz avançou significativamente.

15

A clonagem de voz neural pode replicar estilos de fala.

16

A normalização de texto é importante para pronúncia.

17

Assistentes de voz nos ajudam a interagir com tecnologia.

18

Sistemas TTS modernos usam aprendizado profundo para áudio.

19

A interação humano computador tornou-se mais intuitiva.

Romanian

This section lists text to speech models for Romanian.

vits-piper-ro_RO-mihai-medium

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/ro/ro_RO/mihai/medium

Number of speakers | Sample rate (Hz)
1 | 22050

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-ro_RO-mihai-medium.tar.bz2
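
Once the archive is downloaded, it has to be extracted before the examples on this page can find the model files. The following is a minimal Python sketch of that step. It assumes the archive is already on disk and that it unpacks into a directory named after the archive, as the paths in the code examples suggest. `extract_model` and `EXPECTED` are hypothetical names, not part of sherpa-onnx.

```python
import tarfile
from pathlib import Path

# Entries the examples on this page expect inside the model directory.
EXPECTED = ["ro_RO-mihai-medium.onnx", "tokens.txt", "espeak-ng-data"]


def extract_model(archive, dest="."):
    """Extract a .tar.bz2 archive and return the expected entries that are missing.

    Hypothetical helper. Only extract archives from a source you trust.
    """
    with tarfile.open(archive, "r:bz2") as tar:
        tar.extractall(dest)
    # Assumption: the top-level directory has the archive's name without .tar.bz2
    model_dir = Path(dest) / Path(archive).name.replace(".tar.bz2", "")
    return [name for name in EXPECTED if not (model_dir / name).exists()]
```

Calling `extract_model("vits-piper-ro_RO-mihai-medium.tar.bz2")` should return an empty list if the archive has the layout assumed above.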

Android APK

The following table lists the Android TTS Engine APKs that bundle this model, built for sherpa-onnx v1.13.2.

ABI | URL | China mirror
arm64-v8a | Download | Download
armeabi-v7a | Download | Download
x86_64 | Download | Download
x86 | Download | Download

If you don't know which ABI your device uses, arm64-v8a is most likely the right choice.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-ro_RO-mihai-medium with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-ro_RO-mihai-medium/ro_RO-mihai-medium.onnx";
  config.model.vits.tokens = "vits-piper-ro_RO-mihai-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-ro_RO-mihai-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Un foc fără lemne se stinge, o lume fără poveste moare.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc   -I /tmp/sherpa-onnx/shared/include   -L /tmp/sherpa-onnx/shared/lib   -o /tmp/test-piper   /tmp/test-piper.c   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx and soundfile (used below to write the audio file) via

pip install sherpa-onnx soundfile

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-ro_RO-mihai-medium.tar.bz2

You can use the following code to play with vits-piper-ro_RO-mihai-medium

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-ro_RO-mihai-medium/ro_RO-mihai-medium.onnx",
            data_dir="vits-piper-ro_RO-mihai-medium/espeak-ng-data",
            tokens="vits-piper-ro_RO-mihai-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Un foc fără lemne se stinge, o lume fără poveste moare.",
                     sid=0,
                     speed=1.0)

sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
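
If soundfile is not available, the generated samples can also be saved with Python's standard `wave` module. The sketch below assumes the samples are floats in the range [-1, 1] and writes them as 16-bit mono PCM. `write_wav_int16` is a hypothetical helper, not a sherpa-onnx function.

```python
import struct
import wave


def write_wav_int16(filename, samples, sample_rate):
    """Write float samples in [-1, 1] as a 16-bit mono PCM WAV file."""
    pcm = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))  # clip so the int16 conversion cannot overflow
        pcm += struct.pack("<h", int(round(s * 32767)))
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)  # mono
        f.setsampwidth(2)  # 2 bytes = 16 bits per sample
        f.setframerate(sample_rate)
        f.writeframes(bytes(pcm))
```

With the objects from the example above, this would be called as `write_wav_int16("test.wav", audio.samples, audio.sample_rate)`.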

C++ API

You can use the following code to play with vits-piper-ro_RO-mihai-medium with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-ro_RO-mihai-medium/ro_RO-mihai-medium.onnx";
  config.model.vits.tokens = "vits-piper-ro_RO-mihai-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-ro_RO-mihai-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Un foc fără lemne se stinge, o lume fără poveste moare.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++   -std=c++17   -I /tmp/sherpa-onnx/shared/include   -L /tmp/sherpa-onnx/shared/lib   -o /tmp/test-piper   /tmp/test-piper.cc   -lsherpa-onnx-cxx-api   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with vits-piper-ro_RO-mihai-medium with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-ro_RO-mihai-medium/ro_RO-mihai-medium.onnx".into()),
                tokens: Some("vits-piper-ro_RO-mihai-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-ro_RO-mihai-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Un foc fără lemne se stinge, o lume fără poveste moare.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-ro_RO-mihai-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-ro_RO-mihai-medium/ro_RO-mihai-medium.onnx',
        tokens: 'vits-piper-ro_RO-mihai-medium/tokens.txt',
        dataDir: 'vits-piper-ro_RO-mihai-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Un foc fără lemne se stinge, o lume fără poveste moare.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.
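
The real-time factor (RTF) printed by the example above is the synthesis time divided by the duration of the generated audio, so values below 1 mean synthesis runs faster than playback. A minimal Python sketch of the same arithmetic follows. The function name and the numbers in the usage line are illustrative only.

```python
def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """Return RTF = processing time / duration of the generated audio."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # audio length in seconds
    return elapsed_seconds / duration


if __name__ == "__main__":
    # Illustrative numbers: 2 s of audio at 22050 Hz generated in 0.5 s
    print(f"RTF = {real_time_factor(0.5, 44100, 22050):.3f}")  # prints RTF = 0.250
```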

Dart API

You can use the following code to play with vits-piper-ro_RO-mihai-medium with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-ro_RO-mihai-medium/ro_RO-mihai-medium.onnx',
    tokens: 'vits-piper-ro_RO-mihai-medium/tokens.txt',
    dataDir: 'vits-piper-ro_RO-mihai-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Un foc fără lemne se stinge, o lume fără poveste moare.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-ro_RO-mihai-medium with Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-ro_RO-mihai-medium/ro_RO-mihai-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-ro_RO-mihai-medium/tokens.txt",
    dataDir: "vits-piper-ro_RO-mihai-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Un foc fără lemne se stinge, o lume fără poveste moare."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-ro_RO-mihai-medium with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-ro_RO-mihai-medium/ro_RO-mihai-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-ro_RO-mihai-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-ro_RO-mihai-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Un foc fără lemne se stinge, o lume fără poveste moare.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-ro_RO-mihai-medium with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-ro_RO-mihai-medium/ro_RO-mihai-medium.onnx",
        tokens = "vits-piper-ro_RO-mihai-medium/tokens.txt",
        dataDir = "vits-piper-ro_RO-mihai-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Un foc fără lemne se stinge, o lume fără poveste moare.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.
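
The callback in the example above returns 1 to continue and 0 to stop. To illustrate that contract, here is a minimal Python sketch of a callback object that collects the streamed chunks and asks the generator to stop once a sample budget is reached. `StopAfter` is a hypothetical class, and the `(samples, progress)` signature is an assumption modelled on the C++ ProgressCallback shown earlier, so please check the binding you actually use.

```python
class StopAfter:
    """Collect streamed chunks and return 0 once max_samples is reached."""

    def __init__(self, max_samples):
        self.max_samples = max_samples
        self.chunks = []  # one list of samples per callback invocation
        self.total = 0    # number of samples received so far

    def __call__(self, samples, progress=0.0):
        self.chunks.append(list(samples))
        self.total += len(samples)
        # 1 means continue generating, 0 means stop
        return 1 if self.total < self.max_samples else 0
```

Keeping the state in an object rather than in globals makes it straightforward to reuse the same logic for several generation calls.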

Java API

You can use the following code to play with vits-piper-ro_RO-mihai-medium with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-ro_RO-mihai-medium/ro_RO-mihai-medium.onnx");
    vits.setTokens("vits-piper-ro_RO-mihai-medium/tokens.txt");
    vits.setDataDir("vits-piper-ro_RO-mihai-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Un foc fără lemne se stinge, o lume fără poveste moare.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-ro_RO-mihai-medium with Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-ro_RO-mihai-medium/ro_RO-mihai-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-ro_RO-mihai-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-ro_RO-mihai-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Un foc fără lemne se stinge, o lume fără poveste moare.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-ro_RO-mihai-medium with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-ro_RO-mihai-medium/ro_RO-mihai-medium.onnx",
				Tokens:  "vits-piper-ro_RO-mihai-medium/tokens.txt",
				DataDir: "vits-piper-ro_RO-mihai-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Un foc fără lemne se stinge, o lume fără poveste moare."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Un foc fără lemne se stinge, o lume fără poveste moare.

sample audios for different speakers are listed below:

Speaker 0

supertonic-3-ro

Info about this model

This model is Supertonic 3 from https://huggingface.co/Supertone/supertonic-3

It supports 31 languages: en, ko, ja, ar, bg, cs, da, de, el, es, et, fi, fr, hi, hr, hu, id, it, lt, lv, nl, pl, pt, ro, ru, sk, sl, sv, tr, uk, vi.

This page shows samples for Romanian (ro).
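
Other examples for this model on this page select the language by passing a small JSON string such as {"lang": "pt"} as the extra generation option. As a minimal sketch, the Python below checks a language code against the 31 codes listed above and builds that JSON string. `SUPPORTED_LANGS` and `build_lang_extra` are hypothetical names, not part of sherpa-onnx.

```python
import json

# The 31 language codes listed for supertonic-3 above.
SUPPORTED_LANGS = (
    "en ko ja ar bg cs da de el es et fi fr hi hr hu id it lt lv "
    "nl pl pt ro ru sk sl sv tr uk vi"
).split()


def build_lang_extra(lang):
    """Return the JSON string for the extra option (hypothetical helper)."""
    if lang not in SUPPORTED_LANGS:
        raise ValueError(f"Unsupported language code: {lang!r}")
    return json.dumps({"lang": lang})


if __name__ == "__main__":
    print(build_lang_extra("ro"))  # prints {"lang": "ro"}
```

Validating the code up front gives a clear error message instead of relying on however the model handles an unknown language.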

Number of speakers | Sample rate (Hz)
10 | 24000

Speaker IDs

sid | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-supertonic-3-tts-int8-2026-05-11.tar.bz2

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

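| ABI         | URL      | China mirror |
|-------------|----------|--------------|
| arm64-v8a   | Download | Download     |
| armeabi-v7a | Download | Download     |
| x86_64      | Download | Download     |
| x86         | Download | Download     |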

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-supertonic-3-tts-int8-2026-05-11.tar.bz2

You can use the following code to play with supertonic-3

import sherpa_onnx
import soundfile as sf

tts_config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        supertonic=sherpa_onnx.OfflineTtsSupertonicModelConfig(
            duration_predictor="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
            text_encoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
            vector_estimator="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
            vocoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
            tts_json="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
            unicode_indexer="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
            voice_style="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
        ),
        debug=False,
        num_threads=2,
        provider="cpu",
    ),
)

tts = sherpa_onnx.OfflineTts(tts_config)

gen_config = sherpa_onnx.GenerationConfig()
gen_config.sid = 0
gen_config.num_steps = 8
gen_config.speed = 1.0
gen_config.extra["lang"] = "ro"

audio = tts.generate("Acesta este un motor text to speech care folosește generația următoare de kadi", gen_config)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)

C API

You can use the following code to play with supertonic-3 with C API.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));
  config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
  config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
  config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
  config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
  config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
  config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
  config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";

  config.model.num_threads = 2;

  // If you don't want to see debug messages, please set it to 0
  config.model.debug = 1;

  const char *text = "Acesta este un motor text to speech care folosește generația următoare de kadi";

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.num_steps = 8;
  gen_cfg.speed = 1.0;
  gen_cfg.extra = "{\"lang\": \"ro\"}";

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg,
                                             ProgressCallback, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
 -DSHERPA_ONNX_ENABLE_C_API=ON \
 -DCMAKE_BUILD_TYPE=Release \
 -DBUILD_SHARED_LIBS=ON \
 -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
 ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-supertonic.c. Then you can compile it with the following command:

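# The source file is listed before the libraries so that linkers
# that default to --as-needed can still resolve its symbols.
gcc \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-supertonic \
  /tmp/test-supertonic.c \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-c-api \
  -lonnxruntime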

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-supertonic.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

C++ API

You can use the following code to play with supertonic-3 with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;

  config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
  config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
  config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
  config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
  config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
  config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
  config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";

  config.model.num_threads = 2;

  // If you don't want to see debug messages, please set it to 0
  config.model.debug = 1;

  std::string filename = "./test.wav";
  std::string text = "Acesta este un motor text to speech care folosește generația următoare de kadi";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.num_steps = 8;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed
  gen_cfg.extra = R"({"lang": "ro"})";

  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
 -DSHERPA_ONNX_ENABLE_C_API=ON \
 -DCMAKE_BUILD_TYPE=Release \
 -DBUILD_SHARED_LIBS=ON \
 -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
 ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-supertonic.cc. Then you can compile it with the following command:

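# The source file is listed before the libraries so that linkers
# that default to --as-needed can still resolve its symbols.
g++ \
  -std=c++17 \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-supertonic \
  /tmp/test-supertonic.cc \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-cxx-api \
  -lsherpa-onnx-c-api \
  -lonnxruntime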

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-supertonic.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with supertonic-3 with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsSupertonicModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            supertonic: OfflineTtsSupertonicModelConfig {
                duration_predictor: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx".into()),
                text_encoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx".into()),
                vector_estimator: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx".into()),
                vocoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx".into()),
                tts_json: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json".into()),
                unicode_indexer: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin".into()),
                voice_style: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Acesta este un motor text to speech care folosește generația următoare de kadi";

    let gen_config = GenerationConfig {
        sid: 0,
        num_steps: 8,
        speed: 1.0,
        extra: Some(r#"{"lang": "ro"}"#.into()),
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with supertonic-3 with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      supertonic: {
        durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
        textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
        vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
        vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
        ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
        unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
        voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
      },
      debug: true,
      numThreads: 2,
      provider: 'cpu',
    },
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Acesta este un motor text to speech care folosește generația următoare de kadi';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  numSteps: 8,
  speed: 1.0,
  extra: {lang: 'ro'},
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.
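
The Node.js example above reports a real-time factor (RTF): the time spent generating divided by the duration of the generated audio, so values below 1 mean synthesis runs faster than playback. The following is a minimal Python sketch of the same arithmetic; the function name `real_time_factor` is hypothetical, and the numbers are made up for illustration rather than measured.

```python
def real_time_factor(elapsed_seconds: float, num_samples: int, sample_rate: int) -> float:
    """Return elapsed time divided by the audio duration in seconds."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # audio length in seconds
    return elapsed_seconds / duration


# Illustrative values only: 48000 samples at 24000 Hz is 2 seconds of audio.
rtf = real_time_factor(elapsed_seconds=0.5, num_samples=48000, sample_rate=24000)
print(f"RTF = {rtf:.3f}")
```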

Dart API

You can use the following code to play with supertonic-3 with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final supertonic = sherpa_onnx.OfflineTtsSupertonicModelConfig(
    durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
    textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
    vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
    vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
    ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
    unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
    voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    supertonic: supertonic,
    numThreads: 2,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    numSteps: 8,
    speed: 1.0,
    extra: {'lang': 'ro'},
  );
  final audio = tts.generateWithConfig(text: 'Acesta este un motor text to speech care folosește generația următoare de kadi', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with supertonic-3 with Swift API.

func run() {
  let supertonic = sherpaOnnxOfflineTtsSupertonicModelConfig(
    durationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
    textEncoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
    vectorEstimator: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
    vocoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
    ttsJson: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
    unicodeIndexer: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
    voiceStyle: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(supertonic: supertonic)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Acesta este un motor text to speech care folosește generația următoare de kadi"
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.numSteps = 8
  genConfig.speed = 1.0
  genConfig.extra = ["lang": "ro"]

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with supertonic-3 with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Supertonic.DurationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.Model.Supertonic.TextEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.Model.Supertonic.VectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.Model.Supertonic.Vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.Model.Supertonic.TtsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.Model.Supertonic.UnicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.Model.Supertonic.VoiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";

config.Model.NumThreads = 2;
config.Model.Debug = 1;
config.Model.Provider = "cpu";

var tts = new OfflineTts(config);
var text = "Acesta este un motor text to speech care folosește generația următoare de kadi";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.NumSteps = 8;
genConfig.Speed = 1.0f;
genConfig.Extra = "{\"lang\": \"ro\"}";

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with supertonic-3 with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      supertonic = OfflineTtsSupertonicModelConfig(
        durationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
        textEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
        vectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
        vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
        ttsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
        unicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
        voiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
      ),
      numThreads = 2,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    numSteps = 8,
    speed = 1.0f,
    extra = mapOf("lang" to "ro"),
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Acesta este un motor text to speech care folosește generația următoare de kadi",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with supertonic-3 with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var supertonic = new OfflineTtsSupertonicModelConfig();
    supertonic.setDurationPredictor("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx");
    supertonic.setTextEncoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx");
    supertonic.setVectorEstimator("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx");
    supertonic.setVocoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx");
    supertonic.setTtsJson("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json");
    supertonic.setUnicodeIndexer("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin");
    supertonic.setVoiceStyle("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setSupertonic(supertonic);
    modelConfig.setNumThreads(2);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);

    var tts = new OfflineTts(config);
    var text = "Acesta este un motor text to speech care folosește generația următoare de kadi";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setNumSteps(8);
    genConfig.setSpeed(1.0f);
    genConfig.setExtra("{\"lang\": \"ro\"}");

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with supertonic-3 with Pascal API.

program test_supertonic;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Supertonic.DurationPredictor := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx';
  Config.Model.Supertonic.TextEncoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx';
  Config.Model.Supertonic.VectorEstimator := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx';
  Config.Model.Supertonic.Vocoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx';
  Config.Model.Supertonic.TtsJson := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json';
  Config.Model.Supertonic.UnicodeIndexer := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin';
  Config.Model.Supertonic.VoiceStyle := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin';

  Config.Model.NumThreads := 2;
  Config.Model.Debug := True;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.NumSteps := 8;
  GenConfig.Speed := 1.0;
  GenConfig.Extra := '{"lang": "ro"}';

  Audio := Tts.GenerateWithConfig('Acesta este un motor text to speech care folosește generația următoare de kadi', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with supertonic-3 with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Supertonic: sherpa.OfflineTtsSupertonicModelConfig{
				DurationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
				TextEncoder:       "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
				VectorEstimator:   "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
				Vocoder:           "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
				TtsJson:           "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
				UnicodeIndexer:    "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
				VoiceStyle:        "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
			},
			NumThreads: 2,
			Debug:      true,
		},
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Acesta este un motor text to speech care folosește generația următoare de kadi"

	genConfig := sherpa.GenerationConfig{
		Sid:       0,
		NumSteps:  8,
		Speed:     1.0,
		Extra:     `{"lang": "ro"}`,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

Sample audios for each speaker are listed below. Every speaker reads the same ten Romanian sentences, numbered 0 to 9.

Speaker 0

0. Salut lume.
1. Cum te simți astăzi?
2. Cerul este albastru, iar vântul este blând.
3. Învățarea automată ajută computerele să învețe din date.
4. Sinteza vocală transformă textul în sunet clar.
5. Elevii au citit o poveste scurtă la bibliotecă.
6. Trenul a întârziat din cauza lucrărilor la șine.
7. Modelele mici rulează rapid pe dispozitive locale.
8. Asistentul vocal ajută la sarcinile zilnice.
9. Citirea stabilă este importantă pentru propoziții scurte și lungi.

Speaker 1

0. Salut lume.
1. Cum te simți astăzi?
2. Cerul este albastru, iar vântul este blând.
3. Învățarea automată ajută computerele să învețe din date.
4. Sinteza vocală transformă textul în sunet clar.
5. Elevii au citit o poveste scurtă la bibliotecă.
6. Trenul a întârziat din cauza lucrărilor la șine.
7. Modelele mici rulează rapid pe dispozitive locale.
8. Asistentul vocal ajută la sarcinile zilnice.
9. Citirea stabilă este importantă pentru propoziții scurte și lungi.

Speaker 2

0. Salut lume.
1. Cum te simți astăzi?
2. Cerul este albastru, iar vântul este blând.
3. Învățarea automată ajută computerele să învețe din date.
4. Sinteza vocală transformă textul în sunet clar.
5. Elevii au citit o poveste scurtă la bibliotecă.
6. Trenul a întârziat din cauza lucrărilor la șine.
7. Modelele mici rulează rapid pe dispozitive locale.
8. Asistentul vocal ajută la sarcinile zilnice.
9. Citirea stabilă este importantă pentru propoziții scurte și lungi.

Speaker 3

0. Salut lume.
1. Cum te simți astăzi?
2. Cerul este albastru, iar vântul este blând.
3. Învățarea automată ajută computerele să învețe din date.
4. Sinteza vocală transformă textul în sunet clar.
5. Elevii au citit o poveste scurtă la bibliotecă.
6. Trenul a întârziat din cauza lucrărilor la șine.
7. Modelele mici rulează rapid pe dispozitive locale.
8. Asistentul vocal ajută la sarcinile zilnice.
9. Citirea stabilă este importantă pentru propoziții scurte și lungi.

Speaker 4

0. Salut lume.
1. Cum te simți astăzi?
2. Cerul este albastru, iar vântul este blând.
3. Învățarea automată ajută computerele să învețe din date.
4. Sinteza vocală transformă textul în sunet clar.
5. Elevii au citit o poveste scurtă la bibliotecă.
6. Trenul a întârziat din cauza lucrărilor la șine.
7. Modelele mici rulează rapid pe dispozitive locale.
8. Asistentul vocal ajută la sarcinile zilnice.
9. Citirea stabilă este importantă pentru propoziții scurte și lungi.

Speaker 5

0. Salut lume.
1. Cum te simți astăzi?
2. Cerul este albastru, iar vântul este blând.
3. Învățarea automată ajută computerele să învețe din date.
4. Sinteza vocală transformă textul în sunet clar.
5. Elevii au citit o poveste scurtă la bibliotecă.
6. Trenul a întârziat din cauza lucrărilor la șine.
7. Modelele mici rulează rapid pe dispozitive locale.
8. Asistentul vocal ajută la sarcinile zilnice.
9. Citirea stabilă este importantă pentru propoziții scurte și lungi.

Speaker 6

0. Salut lume.
1. Cum te simți astăzi?
2. Cerul este albastru, iar vântul este blând.
3. Învățarea automată ajută computerele să învețe din date.
4. Sinteza vocală transformă textul în sunet clar.
5. Elevii au citit o poveste scurtă la bibliotecă.
6. Trenul a întârziat din cauza lucrărilor la șine.
7. Modelele mici rulează rapid pe dispozitive locale.
8. Asistentul vocal ajută la sarcinile zilnice.
9. Citirea stabilă este importantă pentru propoziții scurte și lungi.

Speaker 7

0. Salut lume.
1. Cum te simți astăzi?
2. Cerul este albastru, iar vântul este blând.
3. Învățarea automată ajută computerele să învețe din date.
4. Sinteza vocală transformă textul în sunet clar.
5. Elevii au citit o poveste scurtă la bibliotecă.
6. Trenul a întârziat din cauza lucrărilor la șine.
7. Modelele mici rulează rapid pe dispozitive locale.
8. Asistentul vocal ajută la sarcinile zilnice.
9. Citirea stabilă este importantă pentru propoziții scurte și lungi.

Speaker 8

0. Salut lume.
1. Cum te simți astăzi?
2. Cerul este albastru, iar vântul este blând.
3. Învățarea automată ajută computerele să învețe din date.
4. Sinteza vocală transformă textul în sunet clar.
5. Elevii au citit o poveste scurtă la bibliotecă.
6. Trenul a întârziat din cauza lucrărilor la șine.
7. Modelele mici rulează rapid pe dispozitive locale.
8. Asistentul vocal ajută la sarcinile zilnice.
9. Citirea stabilă este importantă pentru propoziții scurte și lungi.

Speaker 9

0. Salut lume.
1. Cum te simți astăzi?
2. Cerul este albastru, iar vântul este blând.
3. Învățarea automată ajută computerele să învețe din date.
4. Sinteza vocală transformă textul în sunet clar.
5. Elevii au citit o poveste scurtă la bibliotecă.
6. Trenul a întârziat din cauza lucrărilor la șine.
7. Modelele mici rulează rapid pe dispozitive locale.
8. Asistentul vocal ajută la sarcinile zilnice.
9. Citirea stabilă este importantă pentru propoziții scurte și lungi.

Russian

This section lists text to speech models for Russian.

vits-piper-ru_RU-denis-medium

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/ru/ru_RU/denis/medium

| Number of speakers | Sample rate |
|--------------------|-------------|
| 1                  | 22050       |

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-ru_RU-denis-medium.tar.bz2

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

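| ABI         | URL      | China mirror |
|-------------|----------|--------------|
| arm64-v8a   | Download | Download     |
| armeabi-v7a | Download | Download     |
| x86_64      | Download | Download     |
| x86         | Download | Download     |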

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-ru_RU-denis-medium with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-ru_RU-denis-medium/ru_RU-denis-medium.onnx";
  config.model.vits.tokens = "vits-piper-ru_RU-denis-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-ru_RU-denis-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Если курица укусит, ей отрубят голову.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and libraries inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command, which lists the source file before the libraries so that linkers that resolve symbols in command-line order can find them:

gcc   -I /tmp/sherpa-onnx/shared/include   -o /tmp/test-piper   /tmp/test-piper.c   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-ru_RU-denis-medium.tar.bz2

You can use the following code to play with vits-piper-ru_RU-denis-medium with the Python API.

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-ru_RU-denis-medium/ru_RU-denis-medium.onnx",
            data_dir="vits-piper-ru_RU-denis-medium/espeak-ng-data",
            tokens="vits-piper-ru_RU-denis-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Если курица укусит, ей отрубят голову.",
                     sid=0,
                     speed=1.0)

# Write a WAV file, like the other examples on this page
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
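
A failed config.validate() call usually comes down to a wrong path. As a minimal sketch, the hypothetical helper below (not part of sherpa-onnx) lists which of the three paths used in the config above are missing, assuming the archive was extracted into the current working directory.

```python
from pathlib import Path


def missing_model_files(model_dir, model_name):
    """Return the expected paths under model_dir that do not exist."""
    root = Path(model_dir)
    expected = [
        root / model_name,        # the .onnx acoustic model
        root / "tokens.txt",      # token table
        root / "espeak-ng-data",  # directory with espeak-ng data
    ]
    return [str(p) for p in expected if not p.exists()]
```

Calling missing_model_files("vits-piper-ru_RU-denis-medium", "ru_RU-denis-medium.onnx") before building the config returns an empty list when everything is in place.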

C++ API

You can use the following code to play with vits-piper-ru_RU-denis-medium with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-ru_RU-denis-medium/ru_RU-denis-medium.onnx";
  config.model.vits.tokens = "vits-piper-ru_RU-denis-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-ru_RU-denis-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Если курица укусит, ей отрубят голову.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and libraries inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command, which lists the source file before the libraries so that linkers that resolve symbols in command-line order can find them:

g++   -std=c++17   -I /tmp/sherpa-onnx/shared/include   -o /tmp/test-piper   /tmp/test-piper.cc   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-cxx-api   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html
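
The progress callback in the C++ example returns 1 to continue generating and 0 to stop. The sketch below illustrates that contract with a toy driver loop in plain Python. The names run_with_callback and stop_after_half are hypothetical, and the loop is a stand-in for illustration only, not how sherpa-onnx is implemented. In particular, it assumes a chunk is kept before the return value is checked.

```python
def run_with_callback(chunks, callback=None):
    """Toy driver: pass each chunk to callback and stop once it returns 0."""
    collected = []
    total = len(chunks)
    for i, chunk in enumerate(chunks, start=1):
        collected.extend(chunk)
        progress = i / total  # fraction of chunks produced so far
        if callback is not None and callback(chunk, progress) == 0:
            break  # the callback asked to stop generating
    return collected


def stop_after_half(samples, progress):
    """Example callback: report progress and stop at 50%."""
    print(f"Progress: {progress * 100:.1f}%")
    return 0 if progress >= 0.5 else 1
```

With four chunks, stop_after_half stops the loop after the second one, so only the samples generated up to that point are returned.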

Rust API

You can use the following code to play with vits-piper-ru_RU-denis-medium with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-ru_RU-denis-medium/ru_RU-denis-medium.onnx".into()),
                tokens: Some("vits-piper-ru_RU-denis-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-ru_RU-denis-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Если курица укусит, ей отрубят голову.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-ru_RU-denis-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-ru_RU-denis-medium/ru_RU-denis-medium.onnx',
        tokens: 'vits-piper-ru_RU-denis-medium/tokens.txt',
        dataDir: 'vits-piper-ru_RU-denis-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Если курица укусит, ей отрубят голову.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.
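
The real-time factor (RTF) printed by the example above is the elapsed wall-clock time divided by the duration of the generated audio, so values below 1 mean synthesis runs faster than playback. A minimal sketch of the same arithmetic in Python, with a hypothetical function name:

```python
def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """Return elapsed time divided by the duration of the generated audio."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # audio length in seconds
    return elapsed_seconds / duration
```

For instance, generating 44100 samples at 22050 Hz (2 seconds of audio) in 0.5 seconds gives an RTF of 0.25.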

Dart API

You can use the following code to play with vits-piper-ru_RU-denis-medium with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-ru_RU-denis-medium/ru_RU-denis-medium.onnx',
    tokens: 'vits-piper-ru_RU-denis-medium/tokens.txt',
    dataDir: 'vits-piper-ru_RU-denis-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Если курица укусит, ей отрубят голову.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-ru_RU-denis-medium with Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-ru_RU-denis-medium/ru_RU-denis-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-ru_RU-denis-medium/tokens.txt",
    dataDir: "vits-piper-ru_RU-denis-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Если курица укусит, ей отрубят голову."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-ru_RU-denis-medium with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-ru_RU-denis-medium/ru_RU-denis-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-ru_RU-denis-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-ru_RU-denis-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Если курица укусит, ей отрубят голову.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-ru_RU-denis-medium with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-ru_RU-denis-medium/ru_RU-denis-medium.onnx",
        tokens = "vits-piper-ru_RU-denis-medium/tokens.txt",
        dataDir = "vits-piper-ru_RU-denis-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Если курица укусит, ей отрубят голову.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with vits-piper-ru_RU-denis-medium with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-ru_RU-denis-medium/ru_RU-denis-medium.onnx");
    vits.setTokens("vits-piper-ru_RU-denis-medium/tokens.txt");
    vits.setDataDir("vits-piper-ru_RU-denis-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Если курица укусит, ей отрубят голову.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-ru_RU-denis-medium with Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-ru_RU-denis-medium/ru_RU-denis-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-ru_RU-denis-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-ru_RU-denis-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Если курица укусит, ей отрубят голову.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-ru_RU-denis-medium with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-ru_RU-denis-medium/ru_RU-denis-medium.onnx",
				Tokens:  "vits-piper-ru_RU-denis-medium/tokens.txt",
				DataDir: "vits-piper-ru_RU-denis-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Если курица укусит, ей отрубят голову."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Если курица укусит, ей отрубят голову.

sample audio for each speaker is listed below:

Speaker 0

vits-piper-ru_RU-dmitri-medium

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/ru/ru_RU/dmitri/medium

Number of speakers    Sample rate
1                     22050

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-ru_RU-dmitri-medium.tar.bz2

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI            URL         China mirror
arm64-v8a      Download    Download
armeabi-v7a    Download    Download
x86_64         Download    Download
x86            Download    Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-ru_RU-dmitri-medium with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-ru_RU-dmitri-medium/ru_RU-dmitri-medium.onnx";
  config.model.vits.tokens = "vits-piper-ru_RU-dmitri-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-ru_RU-dmitri-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Если курица укусит, ей отрубят голову.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and libraries inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command, which lists the source file before the libraries so that linkers that resolve symbols in command-line order can find them:

gcc   -I /tmp/sherpa-onnx/shared/include   -o /tmp/test-piper   /tmp/test-piper.c   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-ru_RU-dmitri-medium.tar.bz2

You can use the following code to play with vits-piper-ru_RU-dmitri-medium with the Python API.

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-ru_RU-dmitri-medium/ru_RU-dmitri-medium.onnx",
            data_dir="vits-piper-ru_RU-dmitri-medium/espeak-ng-data",
            tokens="vits-piper-ru_RU-dmitri-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Если курица укусит, ей отрубят голову.",
                     sid=0,
                     speed=1.0)

# Write a WAV file, like the other examples on this page
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
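
One way to confirm that a generated WAV file, such as the test.wav written by the C and C++ examples, matches the 22050 Hz sample rate listed for this model is to read its header. The sketch below uses the standard wave module and assumes the file is PCM-encoded, which that module requires. The function name wav_info is hypothetical.

```python
import wave


def wav_info(path):
    """Return sample rate, channel count, and duration of a PCM WAV file."""
    with wave.open(path, "rb") as f:
        sample_rate = f.getframerate()
        return {
            "sample_rate": sample_rate,
            "channels": f.getnchannels(),
            "duration_seconds": f.getnframes() / sample_rate,
        }
```

Calling wav_info("test.wav") on the output of one of the examples should then report a sample_rate of 22050 if the file was produced by this model.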

C++ API

You can use the following code to play with vits-piper-ru_RU-dmitri-medium with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-ru_RU-dmitri-medium/ru_RU-dmitri-medium.onnx";
  config.model.vits.tokens = "vits-piper-ru_RU-dmitri-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-ru_RU-dmitri-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Если курица укусит, ей отрубят голову.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and libraries inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command, which lists the source file before the libraries so that linkers that resolve symbols in command-line order can find them:

g++   -std=c++17   -I /tmp/sherpa-onnx/shared/include   -o /tmp/test-piper   /tmp/test-piper.cc   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-cxx-api   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with vits-piper-ru_RU-dmitri-medium with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-ru_RU-dmitri-medium/ru_RU-dmitri-medium.onnx".into()),
                tokens: Some("vits-piper-ru_RU-dmitri-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-ru_RU-dmitri-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Если курица укусит, ей отрубят голову.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-ru_RU-dmitri-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-ru_RU-dmitri-medium/ru_RU-dmitri-medium.onnx',
        tokens: 'vits-piper-ru_RU-dmitri-medium/tokens.txt',
        dataDir: 'vits-piper-ru_RU-dmitri-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Если курица укусит, ей отрубят голову.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to play with vits-piper-ru_RU-dmitri-medium with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-ru_RU-dmitri-medium/ru_RU-dmitri-medium.onnx',
    tokens: 'vits-piper-ru_RU-dmitri-medium/tokens.txt',
    dataDir: 'vits-piper-ru_RU-dmitri-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Если курица укусит, ей отрубят голову.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-ru_RU-dmitri-medium with Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-ru_RU-dmitri-medium/ru_RU-dmitri-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-ru_RU-dmitri-medium/tokens.txt",
    dataDir: "vits-piper-ru_RU-dmitri-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Если курица укусит, ей отрубят голову."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-ru_RU-dmitri-medium with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-ru_RU-dmitri-medium/ru_RU-dmitri-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-ru_RU-dmitri-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-ru_RU-dmitri-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Если курица укусит, ей отрубят голову.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-ru_RU-dmitri-medium with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-ru_RU-dmitri-medium/ru_RU-dmitri-medium.onnx",
        tokens = "vits-piper-ru_RU-dmitri-medium/tokens.txt",
        dataDir = "vits-piper-ru_RU-dmitri-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Если курица укусит, ей отрубят голову.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with vits-piper-ru_RU-dmitri-medium with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-ru_RU-dmitri-medium/ru_RU-dmitri-medium.onnx");
    vits.setTokens("vits-piper-ru_RU-dmitri-medium/tokens.txt");
    vits.setDataDir("vits-piper-ru_RU-dmitri-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Если курица укусит, ей отрубят голову.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-ru_RU-dmitri-medium with Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-ru_RU-dmitri-medium/ru_RU-dmitri-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-ru_RU-dmitri-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-ru_RU-dmitri-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Если курица укусит, ей отрубят голову.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-ru_RU-dmitri-medium with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-ru_RU-dmitri-medium/ru_RU-dmitri-medium.onnx",
				Tokens:  "vits-piper-ru_RU-dmitri-medium/tokens.txt",
				DataDir: "vits-piper-ru_RU-dmitri-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Если курица укусит, ей отрубят голову."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Если курица укусит, ей отрубят голову.

sample audios for different speakers are listed below:

Speaker 0

vits-piper-ru_RU-irina-medium

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/ru/ru_RU/irina/medium

| Number of speakers | Sample rate (Hz) |
|---|---|
| 1 | 22050 |

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-ru_RU-irina-medium.tar.bz2
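
If you prefer to script the download, the following is a minimal sketch using only the Python standard library. The URL is the one listed above; `extract_archive` and `download_and_extract` are hypothetical helper names chosen for illustration.

```python
import tarfile
import urllib.request
from pathlib import Path

MODEL_URL = (
    "https://github.com/k2-fsa/sherpa-onnx/releases/download/"
    "tts-models/vits-piper-ru_RU-irina-medium.tar.bz2"
)


def extract_archive(archive_path, dest_dir="."):
    """Extract a .tar.bz2 archive into dest_dir; return its top-level names."""
    with tarfile.open(archive_path, "r:bz2") as tar:
        names = sorted({m.name.split("/")[0] for m in tar.getmembers()})
        try:
            # The "data" filter rejects unsafe members where it is supported.
            tar.extractall(dest_dir, filter="data")
        except TypeError:
            # Older Python versions have no filter parameter.
            tar.extractall(dest_dir)
    return names


def download_and_extract(url=MODEL_URL, dest_dir="."):
    """Download the archive into dest_dir, extract it, then delete the archive."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    archive_path = dest / url.rsplit("/", 1)[-1]
    urllib.request.urlretrieve(url, archive_path)
    names = extract_archive(archive_path, dest)
    archive_path.unlink()
    return names
```

Calling `download_and_extract()` from the directory where you run the examples should leave a `vits-piper-ru_RU-irina-medium` directory next to your script, which is the layout the examples on this page assume.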

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

| ABI | URL | China mirror |
|---|---|---|
| arm64-v8a | Download | Download |
| armeabi-v7a | Download | Download |
| x86_64 | Download | Download |
| x86 | Download | Download |

If you don’t know what ABI is, you probably need to select arm64-v8a.
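
If you can read the device's machine name (for example, the output of `uname -m` in a terminal app on the device), the sketch below maps it to one of the ABI names in the table. The mapping and the helper name `guess_android_abi` are illustrative assumptions, not part of sherpa-onnx; unknown names fall back to arm64-v8a, in line with the advice above.

```python
import platform

# Common `uname -m` style machine names mapped to Android ABI names.
_MACHINE_TO_ABI = {
    "aarch64": "arm64-v8a",
    "arm64": "arm64-v8a",
    "armv7l": "armeabi-v7a",
    "armv8l": "armeabi-v7a",  # 32-bit userland on a 64-bit ARM CPU
    "x86_64": "x86_64",
    "amd64": "x86_64",
    "i686": "x86",
    "i386": "x86",
}


def guess_android_abi(machine=None):
    """Guess the APK ABI for a machine name; fall back to arm64-v8a."""
    name = (machine or platform.machine()).lower()
    return _MACHINE_TO_ABI.get(name, "arm64-v8a")
```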

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-ru_RU-irina-medium with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-ru_RU-irina-medium/ru_RU-irina-medium.onnx";
  config.model.vits.tokens = "vits-piper-ru_RU-irina-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-ru_RU-irina-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Если курица укусит, ей отрубят голову.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
  -DSHERPA_ONNX_ENABLE_C_API=ON \
  -DCMAKE_BUILD_TYPE=Release \
  -DBUILD_SHARED_LIBS=ON \
  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-piper \
  /tmp/test-piper.c \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.
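
As an alternative to exporting the variable in your shell, the following sketch launches the binary from Python with the loader path set for that process only. The paths are the ones used above; `env_with_library_path` and `run_example` are hypothetical helper names.

```python
import os
import subprocess
import sys


def env_with_library_path(lib_dir, platform_name=sys.platform, base_env=None):
    """Return a copy of the environment with lib_dir prepended to the loader path."""
    env = dict(os.environ if base_env is None else base_env)
    # macOS uses DYLD_LIBRARY_PATH; Linux uses LD_LIBRARY_PATH.
    var = "DYLD_LIBRARY_PATH" if platform_name == "darwin" else "LD_LIBRARY_PATH"
    old = env.get(var, "")
    env[var] = lib_dir + (os.pathsep + old if old else "")
    return env


def run_example(binary="/tmp/test-piper",
                lib_dir="/tmp/sherpa-onnx/shared/lib",
                cwd="/tmp"):
    """Run the compiled example with the shared libraries on the loader path."""
    env = env_with_library_path(lib_dir)
    return subprocess.run([binary], cwd=cwd, env=env, check=True)
```

Because the variable is set only in the child process, your shell environment stays unchanged.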

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-ru_RU-irina-medium.tar.bz2

You can use the following code to play with vits-piper-ru_RU-irina-medium

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-ru_RU-irina-medium/ru_RU-irina-medium.onnx",
            data_dir="vits-piper-ru_RU-irina-medium/espeak-ng-data",
            tokens="vits-piper-ru_RU-irina-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Если курица укусит, ей отрубят голову.",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
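
The example above raises a generic `ValueError` when validation fails. A small pre-check that names the missing paths can make such failures easier to diagnose. This is a sketch under the assumption that the extracted directory contains exactly the three entries referenced in the config; `missing_model_files` is a hypothetical helper, not part of sherpa-onnx.

```python
from pathlib import Path


def missing_model_files(model_dir, voice):
    """Return the expected model paths that do not exist under model_dir."""
    root = Path(model_dir)
    expected = [
        root / f"{voice}.onnx",   # acoustic model
        root / "tokens.txt",      # token table
        root / "espeak-ng-data",  # espeak-ng data directory
    ]
    return [str(p) for p in expected if not p.exists()]
```

For this model you would call `missing_model_files("vits-piper-ru_RU-irina-medium", "ru_RU-irina-medium")` before building the config and report any returned paths.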

C++ API

You can use the following code to play with vits-piper-ru_RU-irina-medium with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-ru_RU-irina-medium/ru_RU-irina-medium.onnx";
  config.model.vits.tokens = "vits-piper-ru_RU-irina-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-ru_RU-irina-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Если курица укусит, ей отрубят голову.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
  -DSHERPA_ONNX_ENABLE_C_API=ON \
  -DCMAKE_BUILD_TYPE=Release \
  -DBUILD_SHARED_LIBS=ON \
  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++ \
  -std=c++17 \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-piper \
  /tmp/test-piper.cc \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-cxx-api \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with vits-piper-ru_RU-irina-medium with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-ru_RU-irina-medium/ru_RU-irina-medium.onnx".into()),
                tokens: Some("vits-piper-ru_RU-irina-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-ru_RU-irina-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Если курица укусит, ей отрубят голову.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-ru_RU-irina-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-ru_RU-irina-medium/ru_RU-irina-medium.onnx',
        tokens: 'vits-piper-ru_RU-irina-medium/tokens.txt',
        dataDir: 'vits-piper-ru_RU-irina-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Если курица укусит, ей отрубят голову.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.
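
The real-time factor (RTF) printed by the example is the processing time divided by the duration of the generated audio, so a value below 1 means synthesis ran faster than playback. The same arithmetic as a small Python sketch (the function name is a hypothetical choice):

```python
def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """RTF = processing time / duration of the generated audio."""
    duration = num_samples / sample_rate  # audio length in seconds
    return elapsed_seconds / duration
```

For instance, generating 44100 samples at 22050 Hz (2 seconds of audio) in 0.5 seconds gives an RTF of 0.25.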

Dart API

You can use the following code to play with vits-piper-ru_RU-irina-medium with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-ru_RU-irina-medium/ru_RU-irina-medium.onnx',
    tokens: 'vits-piper-ru_RU-irina-medium/tokens.txt',
    dataDir: 'vits-piper-ru_RU-irina-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Если курица укусит, ей отрубят голову.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-ru_RU-irina-medium with Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-ru_RU-irina-medium/ru_RU-irina-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-ru_RU-irina-medium/tokens.txt",
    dataDir: "vits-piper-ru_RU-irina-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Если курица укусит, ей отрубят голову."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-ru_RU-irina-medium with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-ru_RU-irina-medium/ru_RU-irina-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-ru_RU-irina-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-ru_RU-irina-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Если курица укусит, ей отрубят голову.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-ru_RU-irina-medium with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-ru_RU-irina-medium/ru_RU-irina-medium.onnx",
        tokens = "vits-piper-ru_RU-irina-medium/tokens.txt",
        dataDir = "vits-piper-ru_RU-irina-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Если курица укусит, ей отрубят голову.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with vits-piper-ru_RU-irina-medium with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-ru_RU-irina-medium/ru_RU-irina-medium.onnx");
    vits.setTokens("vits-piper-ru_RU-irina-medium/tokens.txt");
    vits.setDataDir("vits-piper-ru_RU-irina-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Если курица укусит, ей отрубят голову.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-ru_RU-irina-medium with Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-ru_RU-irina-medium/ru_RU-irina-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-ru_RU-irina-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-ru_RU-irina-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Если курица укусит, ей отрубят голову.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-ru_RU-irina-medium with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-ru_RU-irina-medium/ru_RU-irina-medium.onnx",
				Tokens:  "vits-piper-ru_RU-irina-medium/tokens.txt",
				DataDir: "vits-piper-ru_RU-irina-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Если курица укусит, ей отрубят голову."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Если курица укусит, ей отрубят голову.

sample audios for different speakers are listed below:

Speaker 0

vits-piper-ru_RU-ruslan-medium

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/ru/ru_RU/ruslan/medium

| Number of speakers | Sample rate (Hz) |
|---|---|
| 1 | 22050 |

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-ru_RU-ruslan-medium.tar.bz2

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

| ABI | URL | China mirror |
|---|---|---|
| arm64-v8a | Download | Download |
| armeabi-v7a | Download | Download |
| x86_64 | Download | Download |
| x86 | Download | Download |

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-ru_RU-ruslan-medium with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-ru_RU-ruslan-medium/ru_RU-ruslan-medium.onnx";
  config.model.vits.tokens = "vits-piper-ru_RU-ruslan-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-ru_RU-ruslan-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Если курица укусит, ей отрубят голову.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
  -DSHERPA_ONNX_ENABLE_C_API=ON \
  -DCMAKE_BUILD_TYPE=Release \
  -DBUILD_SHARED_LIBS=ON \
  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-piper \
  /tmp/test-piper.c \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-ru_RU-ruslan-medium.tar.bz2

You can use the following code to play with vits-piper-ru_RU-ruslan-medium

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-ru_RU-ruslan-medium/ru_RU-ruslan-medium.onnx",
            data_dir="vits-piper-ru_RU-ruslan-medium/espeak-ng-data",
            tokens="vits-piper-ru_RU-ruslan-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Если курица укусит, ей отрубят голову.",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
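
If `soundfile` is not available, the generated samples can also be saved with the Python standard library. The sketch below assumes the samples are mono floats in the range [-1, 1] and converts them to 16-bit PCM; `write_wav_int16` is a hypothetical helper, not part of sherpa-onnx.

```python
import struct
import wave


def write_wav_int16(path, samples, sample_rate):
    """Write mono float samples (assumed to lie in [-1, 1]) as 16-bit PCM."""
    frames = bytearray()
    for s in samples:
        # Clip out-of-range values before scaling to the int16 range.
        clipped = max(-1.0, min(1.0, float(s)))
        frames += struct.pack("<h", int(round(clipped * 32767)))
    with wave.open(path, "wb") as f:
        f.setnchannels(1)   # mono
        f.setsampwidth(2)   # 2 bytes per sample = 16-bit
        f.setframerate(sample_rate)
        f.writeframes(bytes(frames))
```

With the example above, this would be called as `write_wav_int16("test.wav", audio.samples, audio.sample_rate)`.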

C++ API

You can use the following code to play with vits-piper-ru_RU-ruslan-medium with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-ru_RU-ruslan-medium/ru_RU-ruslan-medium.onnx";
  config.model.vits.tokens = "vits-piper-ru_RU-ruslan-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-ru_RU-ruslan-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Если курица укусит, ей отрубят голову.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
  -DSHERPA_ONNX_ENABLE_C_API=ON \
  -DCMAKE_BUILD_TYPE=Release \
  -DBUILD_SHARED_LIBS=ON \
  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++ \
  -std=c++17 \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-piper \
  /tmp/test-piper.cc \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-cxx-api \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with vits-piper-ru_RU-ruslan-medium with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-ru_RU-ruslan-medium/ru_RU-ruslan-medium.onnx".into()),
                tokens: Some("vits-piper-ru_RU-ruslan-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-ru_RU-ruslan-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Если курица укусит, ей отрубят голову.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-ru_RU-ruslan-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-ru_RU-ruslan-medium/ru_RU-ruslan-medium.onnx',
        tokens: 'vits-piper-ru_RU-ruslan-medium/tokens.txt',
        dataDir: 'vits-piper-ru_RU-ruslan-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Если курица укусит, ей отрубят голову.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.
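
The real-time factor (RTF) printed by the example above is the elapsed generation time divided by the duration of the generated audio, so a value below 1 means faster than real time. The following is a minimal Python sketch of the same arithmetic. The function name and the numbers in the usage line are hypothetical and are not measurements of this model.

```python
def real_time_factor(elapsed_seconds: float, num_samples: int, sample_rate: int) -> float:
    """Return elapsed time divided by the duration of the generated audio."""
    if sample_rate <= 0:
        raise ValueError("sample_rate must be positive")
    if num_samples <= 0:
        raise ValueError("num_samples must be positive")
    duration = num_samples / sample_rate  # audio length in seconds
    return elapsed_seconds / duration


# Hypothetical numbers: 0.5 s of compute for 44100 samples at 22050 Hz (2 s of audio).
rtf = real_time_factor(0.5, 44100, 22050)
print(f"RTF = {rtf:.3f}")
```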

Dart API

You can use the following code to play with vits-piper-ru_RU-ruslan-medium with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-ru_RU-ruslan-medium/ru_RU-ruslan-medium.onnx',
    tokens: 'vits-piper-ru_RU-ruslan-medium/tokens.txt',
    dataDir: 'vits-piper-ru_RU-ruslan-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Если курица укусит, ей отрубят голову.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-ru_RU-ruslan-medium with Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-ru_RU-ruslan-medium/ru_RU-ruslan-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-ru_RU-ruslan-medium/tokens.txt",
    dataDir: "vits-piper-ru_RU-ruslan-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Если курица укусит, ей отрубят голову."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-ru_RU-ruslan-medium with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-ru_RU-ruslan-medium/ru_RU-ruslan-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-ru_RU-ruslan-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-ru_RU-ruslan-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Если курица укусит, ей отрубят голову.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-ru_RU-ruslan-medium with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-ru_RU-ruslan-medium/ru_RU-ruslan-medium.onnx",
        tokens = "vits-piper-ru_RU-ruslan-medium/tokens.txt",
        dataDir = "vits-piper-ru_RU-ruslan-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Если курица укусит, ей отрубят голову.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with vits-piper-ru_RU-ruslan-medium with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-ru_RU-ruslan-medium/ru_RU-ruslan-medium.onnx");
    vits.setTokens("vits-piper-ru_RU-ruslan-medium/tokens.txt");
    vits.setDataDir("vits-piper-ru_RU-ruslan-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Если курица укусит, ей отрубят голову.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-ru_RU-ruslan-medium with Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-ru_RU-ruslan-medium/ru_RU-ruslan-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-ru_RU-ruslan-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-ru_RU-ruslan-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Если курица укусит, ей отрубят голову.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-ru_RU-ruslan-medium with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-ru_RU-ruslan-medium/ru_RU-ruslan-medium.onnx",
				Tokens:  "vits-piper-ru_RU-ruslan-medium/tokens.txt",
				DataDir: "vits-piper-ru_RU-ruslan-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Если курица укусит, ей отрубят голову."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Если курица укусит, ей отрубят голову.

sample audios for different speakers are listed below:

Speaker 0

supertonic-3-ru

Info about this model

This model is supertonic 3 from https://huggingface.co/Supertone/supertonic-3

It supports 31 languages: en, ko, ja, ar, bg, cs, da, de, el, es, et, fi, fr, hi, hr, hu, id, it, lt, lv, nl, pl, pt, ro, ru, sk, sl, sv, tr, uk, vi.

This page shows samples for Russian (ru).

Number of speakers | Sample rate
10 | 24000

Speaker IDs

sid | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
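
Since the model accepts speaker IDs 0 to 9 and the 31 language codes listed above, it can help to validate both before starting generation. The following is a minimal sketch. The helper name `make_generation_params` is hypothetical and is not part of sherpa-onnx. The JSON form of `extra` mirrors the string passed in the C and C# examples on this page.

```python
import json

# Language codes copied from the model description above.
SUPPORTED_LANGS = (
    "en", "ko", "ja", "ar", "bg", "cs", "da", "de", "el", "es", "et",
    "fi", "fr", "hi", "hr", "hu", "id", "it", "lt", "lv", "nl", "pl",
    "pt", "ro", "ru", "sk", "sl", "sv", "tr", "uk", "vi",
)
NUM_SPEAKERS = 10  # speaker IDs 0..9


def make_generation_params(sid: int, lang: str) -> dict:
    """Validate sid and lang, then return values to copy into a generation config."""
    if not 0 <= sid < NUM_SPEAKERS:
        raise ValueError(f"sid must be in [0, {NUM_SPEAKERS - 1}], got {sid}")
    if lang not in SUPPORTED_LANGS:
        raise ValueError(f"unsupported language code: {lang!r}")
    # JSON-encoded extra options, e.g. {"lang": "ru"}
    return {"sid": sid, "extra": json.dumps({"lang": lang})}


print(make_generation_params(0, "ru"))
```

The returned `sid` and `extra` values can then be copied into whichever generation config type the chosen API uses.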

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-supertonic-3-tts-int8-2026-05-11.tar.bz2
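
If you prefer to script the download, the following Python sketch fetches and extracts the archive with the standard library only, then reports any model file that is missing. It assumes the archive unpacks into a directory named after the archive, which is consistent with the paths used in the API examples on this page. The helper names are hypothetical.

```python
import os
import tarfile
import urllib.request

MODEL_URL = (
    "https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/"
    "sherpa-onnx-supertonic-3-tts-int8-2026-05-11.tar.bz2"
)

# File names taken from the API examples on this page.
EXPECTED_FILES = (
    "duration_predictor.int8.onnx",
    "text_encoder.int8.onnx",
    "vector_estimator.int8.onnx",
    "vocoder.int8.onnx",
    "tts.json",
    "unicode_indexer.bin",
    "voice.bin",
)


def model_dir_name(url: str) -> str:
    """Derive the (assumed) extracted directory name from the archive URL."""
    archive = url.rsplit("/", 1)[-1]
    suffix = ".tar.bz2"
    if not archive.endswith(suffix):
        raise ValueError(f"expected a {suffix} archive, got {archive!r}")
    return archive[: -len(suffix)]


def missing_files(model_dir: str) -> list:
    """Return the expected model files that are not present in model_dir."""
    return [
        name
        for name in EXPECTED_FILES
        if not os.path.isfile(os.path.join(model_dir, name))
    ]


def download_and_extract(url: str, dest: str = ".") -> str:
    """Download the archive into dest, extract it, and delete the archive."""
    archive_path = os.path.join(dest, url.rsplit("/", 1)[-1])
    urllib.request.urlretrieve(url, archive_path)
    with tarfile.open(archive_path, "r:bz2") as tar:
        tar.extractall(dest)
    os.remove(archive_path)
    return os.path.join(dest, model_dir_name(url))


# Usage (requires network access):
#   model_dir = download_and_extract(MODEL_URL)
#   print("Missing files:", missing_files(model_dir))
```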

Android APK

The following table lists the Android TTS Engine APKs that bundle this model, built for sherpa-onnx v1.13.2.

ABI | URL | China mirror
arm64-v8a | Download | Download
armeabi-v7a | Download | Download
x86_64 | Download | Download
x86 | Download | Download

If you don’t know what ABI is, you probably need to select arm64-v8a.
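
If you can read the device's machine type (for example, the output of `uname -m` in a shell on the device), the lookup below is one way to map it to an ABI name from the table. It is only a sketch: the mapping covers a few common values and is an assumption, and it falls back to arm64-v8a as suggested above.

```python
# Common machine names mapped to the Android ABI names used in the table above.
# This mapping is illustrative and not exhaustive.
MACHINE_TO_ABI = {
    "aarch64": "arm64-v8a",
    "armv7l": "armeabi-v7a",
    "x86_64": "x86_64",
    "i686": "x86",
}


def abi_for_machine(machine: str, default: str = "arm64-v8a") -> str:
    """Return the ABI for a machine name, or the default when it is unknown."""
    return MACHINE_TO_ABI.get(machine.strip().lower(), default)


print(abi_for_machine("aarch64"))
```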

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-supertonic-3-tts-int8-2026-05-11.tar.bz2

You can use the following code to play with supertonic-3 with the Python API.

import sherpa_onnx
import soundfile as sf

tts_config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        supertonic=sherpa_onnx.OfflineTtsSupertonicModelConfig(
            duration_predictor="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
            text_encoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
            vector_estimator="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
            vocoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
            tts_json="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
            unicode_indexer="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
            voice_style="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
        ),
        debug=False,
        num_threads=2,
        provider="cpu",
    ),
)

tts = sherpa_onnx.OfflineTts(tts_config)

gen_config = sherpa_onnx.GenerationConfig()
gen_config.sid = 0
gen_config.num_steps = 8
gen_config.speed = 1.0
gen_config.extra["lang"] = "ru"

audio = tts.generate("Это движок преобразования текста в речь, использующий Kaldi следующего поколения.", gen_config)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
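
The example above relies on the third-party soundfile package to save the result. If you would rather avoid that dependency, the sketch below writes the samples as 16-bit PCM using only the standard library. It assumes the samples are mono floats in the range [-1, 1]. The function name `write_wave_pcm16` is hypothetical.

```python
import struct
import wave


def write_wave_pcm16(filename: str, samples, sample_rate: int) -> int:
    """Write mono float samples as 16-bit PCM and return the number of frames."""
    frames = bytearray()
    for s in samples:
        clipped = max(-1.0, min(1.0, float(s)))  # clip to avoid int16 overflow
        frames += struct.pack("<h", int(clipped * 32767))
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)  # mono
        f.setsampwidth(2)  # 2 bytes = 16 bits per sample
        f.setframerate(sample_rate)
        f.writeframes(bytes(frames))
    return len(frames) // 2


# Usage with the objects from the example above:
#   write_wave_pcm16("test.wav", audio.samples, audio.sample_rate)
```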

C API

You can use the following code to play with supertonic-3 with C API.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));
  config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
  config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
  config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
  config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
  config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
  config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
  config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";

  config.model.num_threads = 2;

  // If you don't want to see debug messages, please set it to 0
  config.model.debug = 1;

  const char *text = "Это движок преобразования текста в речь, использующий Kaldi следующего поколения.";

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.num_steps = 8;
  gen_cfg.speed = 1.0;
  gen_cfg.extra = "{\"lang\": \"ru\"}";

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg,
                                             ProgressCallback, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

To build and run the above C example, first build sherpa-onnx as a shared library with the C API enabled:

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
 -DSHERPA_ONNX_ENABLE_C_API=ON \
 -DCMAKE_BUILD_TYPE=Release \
 -DBUILD_SHARED_LIBS=ON \
 -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
 ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-supertonic.c. Then you can compile it with the following command:

gcc \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-supertonic \
  /tmp/test-supertonic.c \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-supertonic.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html
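
The progress callback in the C example returns 1 to continue and 0 to stop. The following Python sketch illustrates that contract with a callback that requests a stop once a given amount of audio has been produced. It assumes the callback is invoked with each newly generated chunk of samples plus a progress value. Whether a particular binding passes exactly these arguments is an assumption, and the factory name is hypothetical.

```python
def make_length_limited_callback(max_seconds: float, sample_rate: int):
    """Return a callback that asks generation to stop after max_seconds of audio."""
    state = {"num_samples": 0}

    def callback(samples, progress: float) -> int:
        # Accumulate the length of every chunk seen so far.
        state["num_samples"] += len(samples)
        generated_seconds = state["num_samples"] / sample_rate
        print(f"Progress: {progress * 100:.3f}%")
        # 1 means continue generating, 0 means stop.
        return 1 if generated_seconds < max_seconds else 0

    return callback


cb = make_length_limited_callback(max_seconds=1.0, sample_rate=10)
print(cb([0.0] * 5, 0.5))  # half a second so far
print(cb([0.0] * 5, 1.0))  # one second reached
```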

C++ API

You can use the following code to play with supertonic-3 with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;

  config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
  config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
  config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
  config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
  config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
  config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
  config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";

  config.model.num_threads = 2;

  // If you don't want to see debug messages, please set it to 0
  config.model.debug = 1;

  std::string filename = "./test.wav";
  std::string text = "Это движок преобразования текста в речь, использующий Kaldi следующего поколения.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.num_steps = 8;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed
  gen_cfg.extra = R"({"lang": "ru"})";

  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

To build and run the above C++ example, first build sherpa-onnx as a shared library with the C API enabled:

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
 -DSHERPA_ONNX_ENABLE_C_API=ON \
 -DCMAKE_BUILD_TYPE=Release \
 -DBUILD_SHARED_LIBS=ON \
 -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
 ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-supertonic.cc. Then you can compile it with the following command:

g++ \
  -std=c++17 \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-supertonic \
  /tmp/test-supertonic.cc \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-cxx-api \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-supertonic.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with supertonic-3 with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsSupertonicModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            supertonic: OfflineTtsSupertonicModelConfig {
                duration_predictor: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx".into()),
                text_encoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx".into()),
                vector_estimator: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx".into()),
                vocoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx".into()),
                tts_json: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json".into()),
                unicode_indexer: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin".into()),
                voice_style: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Это движок преобразования текста в речь, использующий Kaldi следующего поколения.";

    let gen_config = GenerationConfig {
        sid: 0,
        num_steps: 8,
        speed: 1.0,
        extra: Some(r#"{"lang": "ru"}"#.into()),
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with supertonic-3 with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      supertonic: {
        durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
        textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
        vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
        vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
        ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
        unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
        voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
      },
      debug: true,
      numThreads: 2,
      provider: 'cpu',
    },
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Это движок преобразования текста в речь, использующий Kaldi следующего поколения.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  numSteps: 8,
  speed: 1.0,
  extra: {lang: 'ru'},
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to play with supertonic-3 with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final supertonic = sherpa_onnx.OfflineTtsSupertonicModelConfig(
    durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
    textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
    vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
    vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
    ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
    unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
    voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    supertonic: supertonic,
    numThreads: 2,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    numSteps: 8,
    speed: 1.0,
    extra: {'lang': 'ru'},
  );
  final audio = tts.generateWithConfig(text: 'Это движок преобразования текста в речь, использующий Kaldi следующего поколения.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with supertonic-3 with Swift API.

func run() {
  let supertonic = sherpaOnnxOfflineTtsSupertonicModelConfig(
    durationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
    textEncoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
    vectorEstimator: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
    vocoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
    ttsJson: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
    unicodeIndexer: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
    voiceStyle: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(supertonic: supertonic)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Это движок преобразования текста в речь, использующий Kaldi следующего поколения."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.numSteps = 8
  genConfig.speed = 1.0
  genConfig.extra = ["lang": "ru"]

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with supertonic-3 with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Supertonic.DurationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.Model.Supertonic.TextEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.Model.Supertonic.VectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.Model.Supertonic.Vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.Model.Supertonic.TtsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.Model.Supertonic.UnicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.Model.Supertonic.VoiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";

config.Model.NumThreads = 2;
config.Model.Debug = 1;
config.Model.Provider = "cpu";

var tts = new OfflineTts(config);
var text = "Это движок преобразования текста в речь, использующий Kaldi следующего поколения.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.NumSteps = 8;
genConfig.Speed = 1.0f;
genConfig.Extra = "{\"lang\": \"ru\"}";

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with supertonic-3 with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      supertonic = OfflineTtsSupertonicModelConfig(
        durationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
        textEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
        vectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
        vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
        ttsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
        unicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
        voiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
      ),
      numThreads = 2,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    numSteps = 8,
    speed = 1.0f,
    extra = mapOf("lang" to "ru"),
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Это движок преобразования текста в речь, использующий Kaldi следующего поколения.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with supertonic-3 with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var supertonic = new OfflineTtsSupertonicModelConfig();
    supertonic.setDurationPredictor("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx");
    supertonic.setTextEncoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx");
    supertonic.setVectorEstimator("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx");
    supertonic.setVocoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx");
    supertonic.setTtsJson("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json");
    supertonic.setUnicodeIndexer("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin");
    supertonic.setVoiceStyle("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setSupertonic(supertonic);
    modelConfig.setNumThreads(2);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);

    var tts = new OfflineTts(config);
    var text = "Это движок преобразования текста в речь, использующий Kaldi следующего поколения.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setNumSteps(8);
    genConfig.setSpeed(1.0f);
    genConfig.setExtra("{\"lang\": \"ru\"}");

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with supertonic-3 with Pascal API.

program test_supertonic;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Supertonic.DurationPredictor := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx';
  Config.Model.Supertonic.TextEncoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx';
  Config.Model.Supertonic.VectorEstimator := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx';
  Config.Model.Supertonic.Vocoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx';
  Config.Model.Supertonic.TtsJson := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json';
  Config.Model.Supertonic.UnicodeIndexer := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin';
  Config.Model.Supertonic.VoiceStyle := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin';

  Config.Model.NumThreads := 2;
  Config.Model.Debug := True;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.NumSteps := 8;
  GenConfig.Speed := 1.0;
  GenConfig.Extra := '{"lang": "ru"}';

  Audio := Tts.GenerateWithConfig('Это движок преобразования текста в речь, использующий Kaldi следующего поколения.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with supertonic-3 with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Supertonic: sherpa.OfflineTtsSupertonicModelConfig{
				DurationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
				TextEncoder:       "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
				VectorEstimator:   "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
				Vocoder:           "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
				TtsJson:           "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
				UnicodeIndexer:    "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
				VoiceStyle:        "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
			},
			NumThreads: 2,
			Debug:      true,
		},
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Это движок преобразования текста в речь, использующий Kaldi следующего поколения."

	genConfig := sherpa.GenerationConfig{
		Sid:       0,
		NumSteps:  8,
		Speed:     1.0,
		Extra:     `{"lang": "ru"}`,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

Sample audio for each speaker is listed below:

Speaker 0

0

Привет мир.

1

Как у тебя дела сегодня?

2

Небо голубое, а ветер мягкий.

3

Машинное обучение помогает компьютерам учиться на данных.

4

Синтез речи превращает текст в понятный звук.

5

Ученики прочитали короткий рассказ в библиотеке.

6

Поезд задержался из-за ремонта путей.

7

Небольшие модели быстро работают на локальных устройствах.

8

Голосовой помощник помогает в повседневных задачах.

9

Стабильное чтение важно для коротких и длинных предложений.

Speaker 1

0

Привет мир.

1

Как у тебя дела сегодня?

2

Небо голубое, а ветер мягкий.

3

Машинное обучение помогает компьютерам учиться на данных.

4

Синтез речи превращает текст в понятный звук.

5

Ученики прочитали короткий рассказ в библиотеке.

6

Поезд задержался из-за ремонта путей.

7

Небольшие модели быстро работают на локальных устройствах.

8

Голосовой помощник помогает в повседневных задачах.

9

Стабильное чтение важно для коротких и длинных предложений.

Speaker 2

0

Привет мир.

1

Как у тебя дела сегодня?

2

Небо голубое, а ветер мягкий.

3

Машинное обучение помогает компьютерам учиться на данных.

4

Синтез речи превращает текст в понятный звук.

5

Ученики прочитали короткий рассказ в библиотеке.

6

Поезд задержался из-за ремонта путей.

7

Небольшие модели быстро работают на локальных устройствах.

8

Голосовой помощник помогает в повседневных задачах.

9

Стабильное чтение важно для коротких и длинных предложений.

Speaker 3

0

Привет мир.

1

Как у тебя дела сегодня?

2

Небо голубое, а ветер мягкий.

3

Машинное обучение помогает компьютерам учиться на данных.

4

Синтез речи превращает текст в понятный звук.

5

Ученики прочитали короткий рассказ в библиотеке.

6

Поезд задержался из-за ремонта путей.

7

Небольшие модели быстро работают на локальных устройствах.

8

Голосовой помощник помогает в повседневных задачах.

9

Стабильное чтение важно для коротких и длинных предложений.

Speaker 4

0

Привет мир.

1

Как у тебя дела сегодня?

2

Небо голубое, а ветер мягкий.

3

Машинное обучение помогает компьютерам учиться на данных.

4

Синтез речи превращает текст в понятный звук.

5

Ученики прочитали короткий рассказ в библиотеке.

6

Поезд задержался из-за ремонта путей.

7

Небольшие модели быстро работают на локальных устройствах.

8

Голосовой помощник помогает в повседневных задачах.

9

Стабильное чтение важно для коротких и длинных предложений.

Speaker 5

0

Привет мир.

1

Как у тебя дела сегодня?

2

Небо голубое, а ветер мягкий.

3

Машинное обучение помогает компьютерам учиться на данных.

4

Синтез речи превращает текст в понятный звук.

5

Ученики прочитали короткий рассказ в библиотеке.

6

Поезд задержался из-за ремонта путей.

7

Небольшие модели быстро работают на локальных устройствах.

8

Голосовой помощник помогает в повседневных задачах.

9

Стабильное чтение важно для коротких и длинных предложений.

Speaker 6

0

Привет мир.

1

Как у тебя дела сегодня?

2

Небо голубое, а ветер мягкий.

3

Машинное обучение помогает компьютерам учиться на данных.

4

Синтез речи превращает текст в понятный звук.

5

Ученики прочитали короткий рассказ в библиотеке.

6

Поезд задержался из-за ремонта путей.

7

Небольшие модели быстро работают на локальных устройствах.

8

Голосовой помощник помогает в повседневных задачах.

9

Стабильное чтение важно для коротких и длинных предложений.

Speaker 7

0

Привет мир.

1

Как у тебя дела сегодня?

2

Небо голубое, а ветер мягкий.

3

Машинное обучение помогает компьютерам учиться на данных.

4

Синтез речи превращает текст в понятный звук.

5

Ученики прочитали короткий рассказ в библиотеке.

6

Поезд задержался из-за ремонта путей.

7

Небольшие модели быстро работают на локальных устройствах.

8

Голосовой помощник помогает в повседневных задачах.

9

Стабильное чтение важно для коротких и длинных предложений.

Speaker 8

0

Привет мир.

1

Как у тебя дела сегодня?

2

Небо голубое, а ветер мягкий.

3

Машинное обучение помогает компьютерам учиться на данных.

4

Синтез речи превращает текст в понятный звук.

5

Ученики прочитали короткий рассказ в библиотеке.

6

Поезд задержался из-за ремонта путей.

7

Небольшие модели быстро работают на локальных устройствах.

8

Голосовой помощник помогает в повседневных задачах.

9

Стабильное чтение важно для коротких и длинных предложений.

Speaker 9

0

Привет мир.

1

Как у тебя дела сегодня?

2

Небо голубое, а ветер мягкий.

3

Машинное обучение помогает компьютерам учиться на данных.

4

Синтез речи превращает текст в понятный звук.

5

Ученики прочитали короткий рассказ в библиотеке.

6

Поезд задержался из-за ремонта путей.

7

Небольшие модели быстро работают на локальных устройствах.

8

Голосовой помощник помогает в повседневных задачах.

9

Стабильное чтение важно для коротких и длинных предложений.

Serbian

This section lists text to speech models for Serbian.

vits-piper-sr_RS-serbski_institut-medium

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/sr/sr_RS/serbski_institut/medium

Number of speakers | Sample rate
2 | 22050

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-sr_RS-serbski_institut-medium.tar.bz2
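
If you prefer to script the download, the following is a minimal Python sketch using only the standard library. The function names are hypothetical, and it assumes the archive unpacks into a directory named after the model, which matches the paths used in the examples in this section.

```python
import tarfile
import urllib.request
from pathlib import Path

BASE_URL = "https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models"


def archive_url(model_name: str) -> str:
    """Build the release URL of a model archive."""
    return f"{BASE_URL}/{model_name}.tar.bz2"


def extract_archive(archive: Path, dest: Path) -> None:
    """Unpack a .tar.bz2 archive into dest."""
    with tarfile.open(archive, "r:bz2") as tar:
        tar.extractall(dest)


def download_model(model_name: str, dest: Path = Path(".")) -> Path:
    """Download the archive, unpack it, delete it, and return the model directory."""
    archive = dest / f"{model_name}.tar.bz2"
    urllib.request.urlretrieve(archive_url(model_name), archive)
    extract_archive(archive, dest)
    archive.unlink()
    # Assumption: the archive contains a top-level directory named model_name
    return dest / model_name
```

Calling `download_model("vits-piper-sr_RS-serbski_institut-medium")` from the directory where you run the examples should leave the model files at the relative paths used below.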

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI | URL | China mirror
arm64-v8a | Download | Download
armeabi-v7a | Download | Download
x86_64 | Download | Download
x86 | Download | Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-sr_RS-serbski_institut-medium with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-sr_RS-serbski_institut-medium/sr_RS-serbski_institut-medium.onnx";
  config.model.vits.tokens = "vits-piper-sr_RS-serbski_institut-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-sr_RS-serbski_institut-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Круг не може постојати без свог центра, а нација не може постојати без својих хероја.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.c -lsherpa-onnx-c-api -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-sr_RS-serbski_institut-medium.tar.bz2

You can use the following code to play with vits-piper-sr_RS-serbski_institut-medium

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-sr_RS-serbski_institut-medium/sr_RS-serbski_institut-medium.onnx",
            data_dir="vits-piper-sr_RS-serbski_institut-medium/espeak-ng-data",
            tokens="vits-piper-sr_RS-serbski_institut-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Круг не може постојати без свог центра, а нација не може постојати без својих хероја.",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)

C++ API

You can use the following code to play with vits-piper-sr_RS-serbski_institut-medium with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-sr_RS-serbski_institut-medium/sr_RS-serbski_institut-medium.onnx";
  config.model.vits.tokens = "vits-piper-sr_RS-serbski_institut-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-sr_RS-serbski_institut-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Круг не може постојати без свог центра, а нација не може постојати без својих хероја.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.cc -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html
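
The progress callback in the C++ example returns 1 to continue and 0 to stop, which can be used to cap how much audio is generated. The following is a minimal Python sketch of that contract only. The class `StopAfter` and the function `fake_generate` are hypothetical; `fake_generate` stands in for the TTS engine, and no sherpa-onnx call is made here.

```python
from typing import List, Sequence


class StopAfter:
    """Collect audio chunks and request a stop once max_samples is reached."""

    def __init__(self, max_samples: int) -> None:
        self.max_samples = max_samples
        self.samples: List[float] = []

    def __call__(self, chunk: Sequence[float], progress: float) -> int:
        self.samples.extend(chunk)
        # 1 means continue generating, 0 means stop (same contract as above)
        return 1 if len(self.samples) < self.max_samples else 0


def fake_generate(chunks: Sequence[Sequence[float]], callback) -> int:
    """Hypothetical stand-in for an engine: feed chunks until the callback returns 0.

    Returns the number of chunks delivered to the callback.
    """
    delivered = 0
    for i, chunk in enumerate(chunks, start=1):
        delivered = i
        if callback(chunk, i / len(chunks)) == 0:
            break
    return delivered


# Cap the output at one second of 22050 Hz audio; each fake chunk is 0.5 s long.
cb = StopAfter(max_samples=22050)
chunks = [[0.0] * 11025 for _ in range(4)]
delivered = fake_generate(chunks, cb)
print(delivered, len(cb.samples))
```

Keeping the state in a callable object avoids global variables and makes the collected samples available after generation stops.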

Rust API

You can use the following code to play with vits-piper-sr_RS-serbski_institut-medium with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-sr_RS-serbski_institut-medium/sr_RS-serbski_institut-medium.onnx".into()),
                tokens: Some("vits-piper-sr_RS-serbski_institut-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-sr_RS-serbski_institut-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Круг не може постојати без свог центра, а нација не може постојати без својих хероја.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-sr_RS-serbski_institut-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-sr_RS-serbski_institut-medium/sr_RS-serbski_institut-medium.onnx',
        tokens: 'vits-piper-sr_RS-serbski_institut-medium/tokens.txt',
        dataDir: 'vits-piper-sr_RS-serbski_institut-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Круг не може постојати без свог центра, а нација не може постојати без својих хероја.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.
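
The Node.js example above reports a real-time factor (RTF): the time spent generating divided by the duration of the generated audio. The following is a small Python sketch of the same arithmetic. The function name is hypothetical and the numbers are illustrative only, not measurements.

```python
def real_time_factor(elapsed_seconds: float, num_samples: int, sample_rate: int) -> float:
    """RTF = processing time / audio duration; below 1.0 means faster than real time."""
    duration = num_samples / sample_rate
    return elapsed_seconds / duration


# Illustrative numbers: 0.5 s spent generating 44100 samples at 22050 Hz (2 s of audio)
rtf = real_time_factor(0.5, 44100, 22050)
print(f"RTF = {rtf:.3f}")
```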

Dart API

You can use the following code to play with vits-piper-sr_RS-serbski_institut-medium with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-sr_RS-serbski_institut-medium/sr_RS-serbski_institut-medium.onnx',
    tokens: 'vits-piper-sr_RS-serbski_institut-medium/tokens.txt',
    dataDir: 'vits-piper-sr_RS-serbski_institut-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Круг не може постојати без свог центра, а нација не може постојати без својих хероја.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-sr_RS-serbski_institut-medium with Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-sr_RS-serbski_institut-medium/sr_RS-serbski_institut-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-sr_RS-serbski_institut-medium/tokens.txt",
    dataDir: "vits-piper-sr_RS-serbski_institut-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Круг не може постојати без свог центра, а нација не може постојати без својих хероја."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-sr_RS-serbski_institut-medium with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-sr_RS-serbski_institut-medium/sr_RS-serbski_institut-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-sr_RS-serbski_institut-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-sr_RS-serbski_institut-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Круг не може постојати без свог центра, а нација не може постојати без својих хероја.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-sr_RS-serbski_institut-medium with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-sr_RS-serbski_institut-medium/sr_RS-serbski_institut-medium.onnx",
        tokens = "vits-piper-sr_RS-serbski_institut-medium/tokens.txt",
        dataDir = "vits-piper-sr_RS-serbski_institut-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Круг не може постојати без свог центра, а нација не може постојати без својих хероја.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with vits-piper-sr_RS-serbski_institut-medium with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-sr_RS-serbski_institut-medium/sr_RS-serbski_institut-medium.onnx");
    vits.setTokens("vits-piper-sr_RS-serbski_institut-medium/tokens.txt");
    vits.setDataDir("vits-piper-sr_RS-serbski_institut-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Круг не може постојати без свог центра, а нација не може постојати без својих хероја.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-sr_RS-serbski_institut-medium with Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-sr_RS-serbski_institut-medium/sr_RS-serbski_institut-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-sr_RS-serbski_institut-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-sr_RS-serbski_institut-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Круг не може постојати без свог центра, а нација не може постојати без својих хероја.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-sr_RS-serbski_institut-medium with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-sr_RS-serbski_institut-medium/sr_RS-serbski_institut-medium.onnx",
				Tokens:  "vits-piper-sr_RS-serbski_institut-medium/tokens.txt",
				DataDir: "vits-piper-sr_RS-serbski_institut-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Круг не може постојати без свог центра, а нација не може постојати без својих хероја."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Круг не може постојати без свог центра, а нација не може постојати без својих хероја.

sample audio for each speaker is listed below:

Speaker 0

Speaker 1

Slovak

This section lists text to speech models for Slovak.

vits-piper-sk_SK-lili-medium

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/sk/sk_SK/lili/medium

Number of speakers | Sample rate
1 | 22050

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-sk_SK-lili-medium.tar.bz2
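
After extracting the archive, it can help to confirm that the three paths used by the examples in this section exist before creating the TTS object. The following is a minimal Python sketch; the function name `missing_model_files` is hypothetical, and the file names are the ones passed as `model`, `tokens`, and `data_dir` in the examples below.

```python
from pathlib import Path
from typing import List


def missing_model_files(model_dir: str, onnx_name: str) -> List[str]:
    """Return the expected paths that do not exist under model_dir."""
    root = Path(model_dir)
    expected = [
        root / onnx_name,         # passed as `model`
        root / "tokens.txt",      # passed as `tokens`
        root / "espeak-ng-data",  # passed as `data_dir`
    ]
    return [str(p) for p in expected if not p.exists()]


missing = missing_model_files("vits-piper-sk_SK-lili-medium", "sk_SK-lili-medium.onnx")
if missing:
    print("Missing:", ", ".join(missing))
else:
    print("All model files found")
```

Run it from the directory that contains the extracted model directory, since the examples use relative paths.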

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI | URL | China mirror
arm64-v8a | Download | Download
armeabi-v7a | Download | Download
x86_64 | Download | Download
x86 | Download | Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-sk_SK-lili-medium with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-sk_SK-lili-medium/sk_SK-lili-medium.onnx";
  config.model.vits.tokens = "vits-piper-sk_SK-lili-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-sk_SK-lili-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Kto nepozná strach, nepozná vôľu.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.c -lsherpa-onnx-c-api -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-sk_SK-lili-medium.tar.bz2

You can use the following code to play with vits-piper-sk_SK-lili-medium

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-sk_SK-lili-medium/sk_SK-lili-medium.onnx",
            data_dir="vits-piper-sk_SK-lili-medium/espeak-ng-data",
            tokens="vits-piper-sk_SK-lili-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Kto nepozná strach, nepozná vôľu.",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)

C++ API

You can use the following code to play with vits-piper-sk_SK-lili-medium with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-sk_SK-lili-medium/sk_SK-lili-medium.onnx";
  config.model.vits.tokens = "vits-piper-sk_SK-lili-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-sk_SK-lili-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Kto nepozná strach, nepozná vôľu.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++   -std=c++17   -I /tmp/sherpa-onnx/shared/include   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-cxx-api   -lsherpa-onnx-c-api   -lonnxruntime   -o /tmp/test-piper   /tmp/test-piper.cc

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

If the shared libraries are not found at run time, you probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html
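
The ProgressCallback in the example above follows a simple contract: return 1 to continue generating and 0 to stop. The following Python sketch illustrates that contract without loading any model. Both make_progress_callback and simulate_generation are hypothetical names used only for illustration; simulate_generation is a stand-in for the TTS engine and does not call sherpa-onnx.

```python
def make_progress_callback(stop_after=None, log=None):
    """Build a callback that returns 1 to continue and 0 to stop.

    stop_after: progress in [0, 1] at which to stop, or None to never stop.
    log: optional list that receives one formatted line per call.
    """
    def callback(samples, progress):
        if log is not None:
            log.append(f"Progress: {progress * 100:.3f}%")
        if stop_after is not None and progress >= stop_after:
            return 0  # ask the generator to stop
        return 1  # keep generating
    return callback


def simulate_generation(num_chunks, callback):
    """Stand-in for a TTS engine: call the callback once per chunk and
    stop early when it returns 0. Returns the number of chunks produced."""
    produced = 0
    for i in range(num_chunks):
        produced += 1
        progress = (i + 1) / num_chunks
        if callback([0.0] * 4, progress) == 0:
            break
    return produced
```

Stopping early is useful when the user cancels playback, since the remaining sentences are then never synthesized.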

Rust API

You can use the following code to play with vits-piper-sk_SK-lili-medium with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-sk_SK-lili-medium/sk_SK-lili-medium.onnx".into()),
                tokens: Some("vits-piper-sk_SK-lili-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-sk_SK-lili-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Kto nepozná strach, nepozná vôľu.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-sk_SK-lili-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-sk_SK-lili-medium/sk_SK-lili-medium.onnx',
        tokens: 'vits-piper-sk_SK-lili-medium/tokens.txt',
        dataDir: 'vits-piper-sk_SK-lili-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Kto nepozná strach, nepozná vôľu.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.
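
The Node.js example above reports a real-time factor (RTF), which is the elapsed processing time divided by the duration of the generated audio. A minimal Python sketch of the same arithmetic is shown below. The function name real_time_factor is hypothetical and the numbers in the comment are made up for illustration.

```python
def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """Return elapsed processing time divided by audio duration."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # audio length in seconds
    return elapsed_seconds / duration


# Illustrative numbers only: 48000 samples at 16000 Hz is 3 seconds of audio.
# If generating it took 0.5 seconds, the RTF is 0.5 / 3.
rtf = real_time_factor(0.5, 48000, 16000)
print(f"RTF = {rtf:.3f}")
```

An RTF below 1 means the audio is generated faster than it takes to play it back.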

Dart API

You can use the following code to play with vits-piper-sk_SK-lili-medium with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-sk_SK-lili-medium/sk_SK-lili-medium.onnx',
    tokens: 'vits-piper-sk_SK-lili-medium/tokens.txt',
    dataDir: 'vits-piper-sk_SK-lili-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Kto nepozná strach, nepozná vôľu.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-sk_SK-lili-medium with Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-sk_SK-lili-medium/sk_SK-lili-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-sk_SK-lili-medium/tokens.txt",
    dataDir: "vits-piper-sk_SK-lili-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Kto nepozná strach, nepozná vôľu."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-sk_SK-lili-medium with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-sk_SK-lili-medium/sk_SK-lili-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-sk_SK-lili-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-sk_SK-lili-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Kto nepozná strach, nepozná vôľu.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-sk_SK-lili-medium with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-sk_SK-lili-medium/sk_SK-lili-medium.onnx",
        tokens = "vits-piper-sk_SK-lili-medium/tokens.txt",
        dataDir = "vits-piper-sk_SK-lili-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Kto nepozná strach, nepozná vôľu.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with vits-piper-sk_SK-lili-medium with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-sk_SK-lili-medium/sk_SK-lili-medium.onnx");
    vits.setTokens("vits-piper-sk_SK-lili-medium/tokens.txt");
    vits.setDataDir("vits-piper-sk_SK-lili-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Kto nepozná strach, nepozná vôľu.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-sk_SK-lili-medium with Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-sk_SK-lili-medium/sk_SK-lili-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-sk_SK-lili-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-sk_SK-lili-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Kto nepozná strach, nepozná vôľu.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-sk_SK-lili-medium with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-sk_SK-lili-medium/sk_SK-lili-medium.onnx",
				Tokens:  "vits-piper-sk_SK-lili-medium/tokens.txt",
				DataDir: "vits-piper-sk_SK-lili-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Kto nepozná strach, nepozná vôľu."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Kto nepozná strach, nepozná vôľu.

sample audios for different speakers are listed below:

Speaker 0

supertonic-3-sk

Info about this model

This model is supertonic 3 from https://huggingface.co/Supertone/supertonic-3

It supports 31 languages: en, ko, ja, ar, bg, cs, da, de, el, es, et, fi, fr, hi, hr, hu, id, it, lt, lv, nl, pl, pt, ro, ru, sk, sl, sv, tr, uk, vi.

This page shows samples for Slovak (sk).
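
Several API examples on this page select the language through an extra field that holds a small JSON object such as {"lang": "sk"}. The following sketch builds that string in Python and rejects codes that are not in the list of 31 languages above. The helper name make_extra is hypothetical, and whether the library itself validates the language code is not stated here, so the check is only a precaution.

```python
import json

# The 31 language codes listed above.
SUPPORTED_LANGS = (
    "en ko ja ar bg cs da de el es et fi fr hi hr hu id it lt lv "
    "nl pl pt ro ru sk sl sv tr uk vi"
).split()


def make_extra(lang):
    """Return the JSON string for the extra field, e.g. {"lang": "sk"}."""
    if lang not in SUPPORTED_LANGS:
        raise ValueError(f"Unsupported language code: {lang!r}")
    return json.dumps({"lang": lang})


print(make_extra("sk"))
```

Building the string with json.dumps avoids the manual quote escaping that the C and C# examples need.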

| Number of speakers | Sample rate |
|--------------------|-------------|
| 10                 | 24000       |

Speaker IDs

sid: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9
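
The speaker IDs listed above run from 0 to 9. If you want to generate one file per speaker, a small plan like the following keeps the IDs and output names together. This is only a sketch: check_sid and plan_outputs are hypothetical helpers, and the file name pattern is an assumption.

```python
NUM_SPEAKERS = 10  # speaker IDs 0..9, as listed above


def check_sid(sid, num_speakers=NUM_SPEAKERS):
    """Raise ValueError if sid is outside 0..num_speakers-1."""
    if not 0 <= sid < num_speakers:
        raise ValueError(f"sid must be in [0, {num_speakers - 1}], got {sid}")
    return sid


def plan_outputs(num_speakers=NUM_SPEAKERS, prefix="test-sid"):
    """Return (sid, filename) pairs, one per speaker."""
    return [(sid, f"{prefix}-{sid}.wav") for sid in range(num_speakers)]


for sid, filename in plan_outputs():
    # Here you would set the sid on the generation config and save to filename.
    print(sid, filename)
```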

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-supertonic-3-tts-int8-2026-05-11.tar.bz2
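
If you prefer to fetch the archive from a script, the following sketch downloads and unpacks it with the Python standard library only. The helper names download and extract are hypothetical. The archive is assumed to be a bzip2-compressed tarball, as its .tar.bz2 suffix suggests.

```python
import os
import tarfile
import urllib.request

MODEL_URL = (
    "https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/"
    "sherpa-onnx-supertonic-3-tts-int8-2026-05-11.tar.bz2"
)


def download(url, dest):
    """Download url to dest unless dest already exists. Returns dest."""
    if not os.path.exists(dest):
        urllib.request.urlretrieve(url, dest)
    return dest


def extract(archive, out_dir="."):
    """Extract a .tar.bz2 archive into out_dir and return the member names."""
    os.makedirs(out_dir, exist_ok=True)
    with tarfile.open(archive, "r:bz2") as tar:
        names = tar.getnames()
        tar.extractall(out_dir)
    return names


# Usage sketch (performs a large download, so it is left commented out):
# archive = download(MODEL_URL, os.path.basename(MODEL_URL))
# extract(archive)
```

Only extract archives from a source you trust, since extractall writes every member of the archive to disk.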

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

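| ABI         | URL      | China mirror |
|-------------|----------|--------------|
| arm64-v8a   | Download | Download     |
| armeabi-v7a | Download | Download     |
| x86_64      | Download | Download     |
| x86         | Download | Download     |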

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-supertonic-3-tts-int8-2026-05-11.tar.bz2

You can use the following code to play with supertonic-3

import sherpa_onnx
import soundfile as sf

tts_config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        supertonic=sherpa_onnx.OfflineTtsSupertonicModelConfig(
            duration_predictor="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
            text_encoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
            vector_estimator="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
            vocoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
            tts_json="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
            unicode_indexer="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
            voice_style="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
        ),
        debug=False,
        num_threads=2,
        provider="cpu",
    ),
)

tts = sherpa_onnx.OfflineTts(tts_config)

gen_config = sherpa_onnx.GenerationConfig()
gen_config.sid = 0
gen_config.num_steps = 8
gen_config.speed = 1.0
gen_config.extra["lang"] = "sk"

audio = tts.generate("Toto je nástroj na prevod textu na reč využívajúci kaldi novej generácie", gen_config)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
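
The config above points to seven files inside the extracted model directory. As a minimal sketch, you can check that all of them exist before creating the TTS object, which gives a clearer message than a failed model load. The helper name missing_files is hypothetical; the file names are the ones used in the example above.

```python
import os

MODEL_DIR = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11"

# File names taken from the config in the example above.
REQUIRED_FILES = [
    "duration_predictor.int8.onnx",
    "text_encoder.int8.onnx",
    "vector_estimator.int8.onnx",
    "vocoder.int8.onnx",
    "tts.json",
    "unicode_indexer.bin",
    "voice.bin",
]


def missing_files(model_dir=MODEL_DIR, required=REQUIRED_FILES):
    """Return the required file names that are not present in model_dir."""
    return [
        name for name in required
        if not os.path.isfile(os.path.join(model_dir, name))
    ]


missing = missing_files()
if missing:
    print("Missing model files:", ", ".join(missing))
```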

C API

You can use the following code to play with supertonic-3 with C API.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));
  config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
  config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
  config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
  config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
  config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
  config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
  config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";

  config.model.num_threads = 2;

  // If you don't want to see debug messages, please set it to 0
  config.model.debug = 1;

  const char *text = "Toto je nástroj na prevod textu na reč využívajúci kaldi novej generácie";

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.num_steps = 8;
  gen_cfg.speed = 1.0;
  gen_cfg.extra = "{\"lang\": \"sk\"}";

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg,
                                             ProgressCallback, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
 -DSHERPA_ONNX_ENABLE_C_API=ON \
 -DCMAKE_BUILD_TYPE=Release \
 -DBUILD_SHARED_LIBS=ON \
 -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
 ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-supertonic.c. Then you can compile it with the following command:

gcc \
  -I /tmp/sherpa-onnx/shared/include \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-c-api \
  -lonnxruntime \
  -o /tmp/test-supertonic \
  /tmp/test-supertonic.c

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic

If the shared libraries are not found at run time, you probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-supertonic.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html
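
After running the example, you may want to confirm that ./test.wav contains what you expect. The following sketch reads the header with the wave module from the Python standard library. It assumes the file is an uncompressed PCM WAV file, which is what the wave module can read; the helper name describe_wav is hypothetical.

```python
import wave


def describe_wav(path):
    """Return the channel count, sample rate, frame count and duration of a PCM WAV file."""
    with wave.open(path, "rb") as f:
        sample_rate = f.getframerate()
        num_frames = f.getnframes()
        return {
            "channels": f.getnchannels(),
            "sample_rate": sample_rate,
            "num_frames": num_frames,
            "duration_seconds": num_frames / sample_rate,
        }


# Usage sketch (requires the generated file to exist):
# print(describe_wav("./test.wav"))
```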

C++ API

You can use the following code to play with supertonic-3 with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;

  config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
  config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
  config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
  config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
  config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
  config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
  config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";

  config.model.num_threads = 2;

  // If you don't want to see debug messages, please set it to 0
  config.model.debug = 1;

  std::string filename = "./test.wav";
  std::string text = "Toto je nástroj na prevod textu na reč využívajúci kaldi novej generácie";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.num_steps = 8;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed
  gen_cfg.extra = R"({"lang": "sk"})";

  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
 -DSHERPA_ONNX_ENABLE_C_API=ON \
 -DCMAKE_BUILD_TYPE=Release \
 -DBUILD_SHARED_LIBS=ON \
 -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
 ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-supertonic.cc. Then you can compile it with the following command:

g++ \
  -std=c++17 \
  -I /tmp/sherpa-onnx/shared/include \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-cxx-api \
  -lsherpa-onnx-c-api \
  -lonnxruntime \
  -o /tmp/test-supertonic \
  /tmp/test-supertonic.cc

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic

If the shared libraries are not found at run time, you probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-supertonic.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with supertonic-3 with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsSupertonicModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            supertonic: OfflineTtsSupertonicModelConfig {
                duration_predictor: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx".into()),
                text_encoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx".into()),
                vector_estimator: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx".into()),
                vocoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx".into()),
                tts_json: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json".into()),
                unicode_indexer: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin".into()),
                voice_style: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Toto je nástroj na prevod textu na reč využívajúci kaldi novej generácie";

    let gen_config = GenerationConfig {
        sid: 0,
        num_steps: 8,
        speed: 1.0,
        extra: Some(r#"{"lang": "sk"}"#.into()),
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with supertonic-3 with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      supertonic: {
        durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
        textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
        vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
        vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
        ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
        unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
        voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
      },
      debug: true,
      numThreads: 2,
      provider: 'cpu',
    },
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Toto je nástroj na prevod textu na reč využívajúci kaldi novej generácie';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  numSteps: 8,
  speed: 1.0,
  extra: {lang: 'sk'},
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to play with supertonic-3 with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final supertonic = sherpa_onnx.OfflineTtsSupertonicModelConfig(
    durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
    textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
    vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
    vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
    ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
    unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
    voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    supertonic: supertonic,
    numThreads: 2,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    numSteps: 8,
    speed: 1.0,
    extra: {'lang': 'sk'},
  );
  final audio = tts.generateWithConfig(text: 'Toto je nástroj na prevod textu na reč využívajúci kaldi novej generácie', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with supertonic-3 with Swift API.

func run() {
  let supertonic = sherpaOnnxOfflineTtsSupertonicModelConfig(
    durationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
    textEncoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
    vectorEstimator: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
    vocoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
    ttsJson: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
    unicodeIndexer: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
    voiceStyle: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(supertonic: supertonic)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Toto je nástroj na prevod textu na reč využívajúci kaldi novej generácie"
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.numSteps = 8
  genConfig.speed = 1.0
  genConfig.extra = ["lang": "sk"]

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with supertonic-3 with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Supertonic.DurationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.Model.Supertonic.TextEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.Model.Supertonic.VectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.Model.Supertonic.Vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.Model.Supertonic.TtsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.Model.Supertonic.UnicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.Model.Supertonic.VoiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";

config.Model.NumThreads = 2;
config.Model.Debug = 1;
config.Model.Provider = "cpu";

var tts = new OfflineTts(config);
var text = "Toto je nástroj na prevod textu na reč využívajúci kaldi novej generácie";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.NumSteps = 8;
genConfig.Speed = 1.0f;
genConfig.Extra = "{\"lang\": \"sk\"}";

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.
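
The example above passes the language as a hand-escaped JSON string. As a minimal sketch (shown in Python, using only the standard library), the same string can be produced with a JSON encoder so that quoting is handled automatically. The helper name `make_extra` is hypothetical and not part of sherpa-onnx.

```python
import json


def make_extra(lang: str) -> str:
    """Return a JSON object string with a single "lang" key.

    Hypothetical helper for illustration; it only builds the string.
    """
    return json.dumps({"lang": lang}, ensure_ascii=False)


print(make_extra("sk"))  # prints {"lang": "sk"}
```

The resulting string has the same form as the `Extra` value used in the example above.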

Kotlin API

You can use the following code to play with supertonic-3 with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      supertonic = OfflineTtsSupertonicModelConfig(
        durationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
        textEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
        vectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
        vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
        ttsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
        unicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
        voiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
      ),
      numThreads = 2,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    numSteps = 8,
    speed = 1.0f,
    extra = mapOf("lang" to "sk"),
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Toto je nástroj na prevod textu na reč využívajúci kaldi novej generácie",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.
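
The callback in the example above returns 1 to continue and 0 to stop. The following Python sketch illustrates only that return convention. `StopAfter` and `drive` are hypothetical names, and `drive` is a stand-in loop written for this illustration, not the sherpa-onnx generation loop.

```python
class StopAfter:
    """Callback that asks the caller to stop once enough samples were seen.

    Returns 1 to continue and 0 to stop, matching the convention above.
    """

    def __init__(self, max_samples):
        self.max_samples = max_samples
        self.seen = 0

    def __call__(self, samples):
        self.seen += len(samples)
        return 1 if self.seen < self.max_samples else 0


def drive(chunks, callback):
    """Stand-in for a generation loop (assumption, for illustration only).

    Feeds chunks to the callback until it returns 0 and reports how many
    chunks were delivered.
    """
    delivered = 0
    for chunk in chunks:
        delivered += 1
        if callback(chunk) == 0:
            break
    return delivered


chunks = [[0.0] * 100 for _ in range(5)]
print(drive(chunks, StopAfter(max_samples=250)))  # prints 3
```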

Java API

You can use the following code to play with supertonic-3 with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var supertonic = new OfflineTtsSupertonicModelConfig();
    supertonic.setDurationPredictor("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx");
    supertonic.setTextEncoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx");
    supertonic.setVectorEstimator("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx");
    supertonic.setVocoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx");
    supertonic.setTtsJson("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json");
    supertonic.setUnicodeIndexer("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin");
    supertonic.setVoiceStyle("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setSupertonic(supertonic);
    modelConfig.setNumThreads(2);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);

    var tts = new OfflineTts(config);
    var text = "Toto je nástroj na prevod textu na reč využívajúci kaldi novej generácie";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setNumSteps(8);
    genConfig.setSpeed(1.0f);
    genConfig.setExtra("{\"lang\": \"sk\"}");

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with supertonic-3 with Pascal API.

program test_supertonic;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Supertonic.DurationPredictor := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx';
  Config.Model.Supertonic.TextEncoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx';
  Config.Model.Supertonic.VectorEstimator := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx';
  Config.Model.Supertonic.Vocoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx';
  Config.Model.Supertonic.TtsJson := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json';
  Config.Model.Supertonic.UnicodeIndexer := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin';
  Config.Model.Supertonic.VoiceStyle := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin';

  Config.Model.NumThreads := 2;
  Config.Model.Debug := True;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.NumSteps := 8;
  GenConfig.Speed := 1.0;
  GenConfig.Extra := '{"lang": "sk"}';

  Audio := Tts.GenerateWithConfig('Toto je nástroj na prevod textu na reč využívajúci kaldi novej generácie', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with supertonic-3 with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Supertonic: sherpa.OfflineTtsSupertonicModelConfig{
				DurationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
				TextEncoder:       "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
				VectorEstimator:   "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
				Vocoder:           "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
				TtsJson:           "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
				UnicodeIndexer:    "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
				VoiceStyle:        "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
			},
			NumThreads: 2,
			Debug:      true,
		},
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Toto je nástroj na prevod textu na reč využívajúci kaldi novej generácie"

	genConfig := sherpa.GenerationConfig{
		Sid:       0,
		NumSteps:  8,
		Speed:     1.0,
		Extra:     `{"lang": "sk"}`,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

Sample audios for different speakers are listed below:

Speaker 0

0

Ahoj svet.

1

Ako sa dnes máš?

2

Obloha je modrá a vietor je mierny.

3

Strojové učenie pomáha počítačom učiť sa z dát.

4

Syntéza reči premieňa text na zrozumiteľný zvuk.

5

Žiaci čítali krátky príbeh v knižnici.

6

Vlak meškal pre údržbu trate.

7

Malé modely bežia rýchlo na lokálnych zariadeniach.

8

Hlasový asistent pomáha s každodennými úlohami.

9

Stabilné čítanie je dôležité pre krátke aj dlhé vety.

Speaker 1

0

Ahoj svet.

1

Ako sa dnes máš?

2

Obloha je modrá a vietor je mierny.

3

Strojové učenie pomáha počítačom učiť sa z dát.

4

Syntéza reči premieňa text na zrozumiteľný zvuk.

5

Žiaci čítali krátky príbeh v knižnici.

6

Vlak meškal pre údržbu trate.

7

Malé modely bežia rýchlo na lokálnych zariadeniach.

8

Hlasový asistent pomáha s každodennými úlohami.

9

Stabilné čítanie je dôležité pre krátke aj dlhé vety.

Speaker 2

0

Ahoj svet.

1

Ako sa dnes máš?

2

Obloha je modrá a vietor je mierny.

3

Strojové učenie pomáha počítačom učiť sa z dát.

4

Syntéza reči premieňa text na zrozumiteľný zvuk.

5

Žiaci čítali krátky príbeh v knižnici.

6

Vlak meškal pre údržbu trate.

7

Malé modely bežia rýchlo na lokálnych zariadeniach.

8

Hlasový asistent pomáha s každodennými úlohami.

9

Stabilné čítanie je dôležité pre krátke aj dlhé vety.

Speaker 3

0

Ahoj svet.

1

Ako sa dnes máš?

2

Obloha je modrá a vietor je mierny.

3

Strojové učenie pomáha počítačom učiť sa z dát.

4

Syntéza reči premieňa text na zrozumiteľný zvuk.

5

Žiaci čítali krátky príbeh v knižnici.

6

Vlak meškal pre údržbu trate.

7

Malé modely bežia rýchlo na lokálnych zariadeniach.

8

Hlasový asistent pomáha s každodennými úlohami.

9

Stabilné čítanie je dôležité pre krátke aj dlhé vety.

Speaker 4

0

Ahoj svet.

1

Ako sa dnes máš?

2

Obloha je modrá a vietor je mierny.

3

Strojové učenie pomáha počítačom učiť sa z dát.

4

Syntéza reči premieňa text na zrozumiteľný zvuk.

5

Žiaci čítali krátky príbeh v knižnici.

6

Vlak meškal pre údržbu trate.

7

Malé modely bežia rýchlo na lokálnych zariadeniach.

8

Hlasový asistent pomáha s každodennými úlohami.

9

Stabilné čítanie je dôležité pre krátke aj dlhé vety.

Speaker 5

0

Ahoj svet.

1

Ako sa dnes máš?

2

Obloha je modrá a vietor je mierny.

3

Strojové učenie pomáha počítačom učiť sa z dát.

4

Syntéza reči premieňa text na zrozumiteľný zvuk.

5

Žiaci čítali krátky príbeh v knižnici.

6

Vlak meškal pre údržbu trate.

7

Malé modely bežia rýchlo na lokálnych zariadeniach.

8

Hlasový asistent pomáha s každodennými úlohami.

9

Stabilné čítanie je dôležité pre krátke aj dlhé vety.

Speaker 6

0

Ahoj svet.

1

Ako sa dnes máš?

2

Obloha je modrá a vietor je mierny.

3

Strojové učenie pomáha počítačom učiť sa z dát.

4

Syntéza reči premieňa text na zrozumiteľný zvuk.

5

Žiaci čítali krátky príbeh v knižnici.

6

Vlak meškal pre údržbu trate.

7

Malé modely bežia rýchlo na lokálnych zariadeniach.

8

Hlasový asistent pomáha s každodennými úlohami.

9

Stabilné čítanie je dôležité pre krátke aj dlhé vety.

Speaker 7

0

Ahoj svet.

1

Ako sa dnes máš?

2

Obloha je modrá a vietor je mierny.

3

Strojové učenie pomáha počítačom učiť sa z dát.

4

Syntéza reči premieňa text na zrozumiteľný zvuk.

5

Žiaci čítali krátky príbeh v knižnici.

6

Vlak meškal pre údržbu trate.

7

Malé modely bežia rýchlo na lokálnych zariadeniach.

8

Hlasový asistent pomáha s každodennými úlohami.

9

Stabilné čítanie je dôležité pre krátke aj dlhé vety.

Speaker 8

0

Ahoj svet.

1

Ako sa dnes máš?

2

Obloha je modrá a vietor je mierny.

3

Strojové učenie pomáha počítačom učiť sa z dát.

4

Syntéza reči premieňa text na zrozumiteľný zvuk.

5

Žiaci čítali krátky príbeh v knižnici.

6

Vlak meškal pre údržbu trate.

7

Malé modely bežia rýchlo na lokálnych zariadeniach.

8

Hlasový asistent pomáha s každodennými úlohami.

9

Stabilné čítanie je dôležité pre krátke aj dlhé vety.

Speaker 9

0

Ahoj svet.

1

Ako sa dnes máš?

2

Obloha je modrá a vietor je mierny.

3

Strojové učenie pomáha počítačom učiť sa z dát.

4

Syntéza reči premieňa text na zrozumiteľný zvuk.

5

Žiaci čítali krátky príbeh v knižnici.

6

Vlak meškal pre údržbu trate.

7

Malé modely bežia rýchlo na lokálnych zariadeniach.

8

Hlasový asistent pomáha s každodennými úlohami.

9

Stabilné čítanie je dôležité pre krátke aj dlhé vety.

Slovenian

This section lists text to speech models for Slovenian.

vits-piper-sl_SI-artur-medium

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/sl/sl_SI/artur/medium

Number of speakers | Sample rate
1 | 22050
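
A generated file can be checked against the sample rate listed above. The following is a minimal sketch, assuming the output is a PCM WAV file that Python's standard `wave` module can read. `wav_info` is a hypothetical helper, not part of sherpa-onnx.

```python
import wave


def wav_info(path):
    """Return (sample_rate, num_channels, duration_seconds) of a PCM WAV file."""
    with wave.open(path, "rb") as f:
        rate = f.getframerate()
        return rate, f.getnchannels(), f.getnframes() / rate
```

For a file produced by this model, for example `wav_info("test.wav")`, the first element is expected to match the sample rate in the table.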

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-sl_SI-artur-medium.tar.bz2
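
The archive can also be fetched and unpacked from Python. This is a minimal sketch using only the standard library. The function names `download` and `extract` are hypothetical, and the sketch assumes the archive is a regular bzip2-compressed tarball, as its file name suggests.

```python
import tarfile
import urllib.request
from pathlib import Path

# URL copied from the download address above.
URL = (
    "https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/"
    "vits-piper-sl_SI-artur-medium.tar.bz2"
)


def download(url, dest):
    """Download url to dest unless dest already exists; return dest."""
    dest = Path(dest)
    if not dest.exists():
        urllib.request.urlretrieve(url, str(dest))
    return dest


def extract(archive, out_dir):
    """Extract a .tar.bz2 archive into out_dir and return the member names."""
    with tarfile.open(str(archive), "r:bz2") as tar:
        names = tar.getnames()
        tar.extractall(str(out_dir))
    return names
```

Calling `extract(download(URL, "vits-piper-sl_SI-artur-medium.tar.bz2"), ".")` would place the model directory in the current directory, which is where the examples on this page expect it.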

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI | URL | China mirror
arm64-v8a | Download | Download
armeabi-v7a | Download | Download
x86_64 | Download | Download
x86 | Download | Download

If you don’t know what ABI is, you probably need to select arm64-v8a.
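
If you have shell access to the device, `adb shell getprop ro.product.cpu.abi` reports the ABI directly. As a rough sketch, the mapping below translates common machine names (as printed by `uname -m`) to the ABI names in the table. The mapping is an assumption for illustration and is not exhaustive, and `guess_abi` is a hypothetical helper.

```python
# Common `uname -m` values mapped to Android ABI names.
# Assumed, non-exhaustive mapping for illustration.
MACHINE_TO_ABI = {
    "aarch64": "arm64-v8a",
    "arm64": "arm64-v8a",
    "armv7l": "armeabi-v7a",
    "x86_64": "x86_64",
    "i686": "x86",
    "i386": "x86",
}


def guess_abi(machine, default="arm64-v8a"):
    """Return the ABI for a machine name, falling back to arm64-v8a."""
    return MACHINE_TO_ABI.get(machine.strip().lower(), default)


print(guess_abi("aarch64"))  # prints arm64-v8a
```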

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-sl_SI-artur-medium with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-sl_SI-artur-medium/sl_SI-artur-medium.onnx";
  config.model.vits.tokens = "vits-piper-sl_SI-artur-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-sl_SI-artur-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Kto sa nebojí, nie je hlúpy.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc   -I /tmp/sherpa-onnx/shared/include   -o /tmp/test-piper   /tmp/test-piper.c   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.
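
When the binary is launched from a script, the library path can be set only for the child process instead of being exported in the shell. The following is a minimal Python sketch. `env_with_library_path` is a hypothetical helper, and the choice between the two variables follows the Linux and macOS commands above.

```python
import os
import sys


def env_with_library_path(lib_dir, base_env=None, platform=sys.platform):
    """Return a copy of the environment with lib_dir prepended to the
    dynamic library search path (DYLD_LIBRARY_PATH on macOS,
    LD_LIBRARY_PATH otherwise)."""
    env = dict(os.environ if base_env is None else base_env)
    var = "DYLD_LIBRARY_PATH" if platform == "darwin" else "LD_LIBRARY_PATH"
    old = env.get(var, "")
    env[var] = lib_dir + (os.pathsep + old if old else "")
    return env
```

The result can be passed to `subprocess.run(["/tmp/test-piper"], cwd="/tmp", env=env_with_library_path("/tmp/sherpa-onnx/shared/lib"), check=True)`.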

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-sl_SI-artur-medium.tar.bz2

You can use the following code to play with vits-piper-sl_SI-artur-medium

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-sl_SI-artur-medium/sl_SI-artur-medium.onnx",
            data_dir="vits-piper-sl_SI-artur-medium/espeak-ng-data",
            tokens="vits-piper-sl_SI-artur-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Kto sa nebojí, nie je hlúpy.",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)

C++ API

You can use the following code to play with vits-piper-sl_SI-artur-medium with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-sl_SI-artur-medium/sl_SI-artur-medium.onnx";
  config.model.vits.tokens = "vits-piper-sl_SI-artur-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-sl_SI-artur-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Kto sa nebojí, nie je hlúpy.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++   -std=c++17   -I /tmp/sherpa-onnx/shared/include   -o /tmp/test-piper   /tmp/test-piper.cc   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-cxx-api   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with vits-piper-sl_SI-artur-medium with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-sl_SI-artur-medium/sl_SI-artur-medium.onnx".into()),
                tokens: Some("vits-piper-sl_SI-artur-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-sl_SI-artur-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Kto sa nebojí, nie je hlúpy.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-sl_SI-artur-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-sl_SI-artur-medium/sl_SI-artur-medium.onnx',
        tokens: 'vits-piper-sl_SI-artur-medium/tokens.txt',
        dataDir: 'vits-piper-sl_SI-artur-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Kto sa nebojí, nie je hlúpy.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.
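
The example above reports the real-time factor (RTF), which is the elapsed synthesis time divided by the duration of the generated audio. The same arithmetic is sketched below in Python. `real_time_factor` is a hypothetical helper and the numbers are made up for illustration.

```python
def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """Return elapsed time divided by audio duration.

    A value below 1 means synthesis ran faster than playback.
    """
    duration = num_samples / sample_rate
    if duration <= 0:
        raise ValueError("audio must contain at least one sample")
    return elapsed_seconds / duration


# 22050 samples at 22050 Hz is 1 second of audio.
print(f"RTF = {real_time_factor(0.25, 22050, 22050):.3f}")  # prints RTF = 0.250
```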

Dart API

You can use the following code to play with vits-piper-sl_SI-artur-medium with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-sl_SI-artur-medium/sl_SI-artur-medium.onnx',
    tokens: 'vits-piper-sl_SI-artur-medium/tokens.txt',
    dataDir: 'vits-piper-sl_SI-artur-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Kto sa nebojí, nie je hlúpy.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-sl_SI-artur-medium with Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-sl_SI-artur-medium/sl_SI-artur-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-sl_SI-artur-medium/tokens.txt",
    dataDir: "vits-piper-sl_SI-artur-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Kto sa nebojí, nie je hlúpy."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-sl_SI-artur-medium with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-sl_SI-artur-medium/sl_SI-artur-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-sl_SI-artur-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-sl_SI-artur-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Kto sa nebojí, nie je hlúpy.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-sl_SI-artur-medium with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-sl_SI-artur-medium/sl_SI-artur-medium.onnx",
        tokens = "vits-piper-sl_SI-artur-medium/tokens.txt",
        dataDir = "vits-piper-sl_SI-artur-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Kto sa nebojí, nie je hlúpy.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with vits-piper-sl_SI-artur-medium with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-sl_SI-artur-medium/sl_SI-artur-medium.onnx");
    vits.setTokens("vits-piper-sl_SI-artur-medium/tokens.txt");
    vits.setDataDir("vits-piper-sl_SI-artur-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Kto sa nebojí, nie je hlúpy.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-sl_SI-artur-medium with Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-sl_SI-artur-medium/sl_SI-artur-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-sl_SI-artur-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-sl_SI-artur-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Kto sa nebojí, nie je hlúpy.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-sl_SI-artur-medium with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-sl_SI-artur-medium/sl_SI-artur-medium.onnx",
				Tokens:  "vits-piper-sl_SI-artur-medium/tokens.txt",
				DataDir: "vits-piper-sl_SI-artur-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Kto sa nebojí, nie je hlúpy."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Kto sa nebojí, nie je hlúpy.

sample audios for different speakers are listed below:

Speaker 0

supertonic-3-sl

Info about this model

This model is supertonic 3 from https://huggingface.co/Supertone/supertonic-3

It supports 31 languages: en, ko, ja, ar, bg, cs, da, de, el, es, et, fi, fr, hi, hr, hu, id, it, lt, lv, nl, pl, pt, ro, ru, sk, sl, sv, tr, uk, vi.

This page shows samples for Slovenian (sl).
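
Because the language is selected with a short code, it can help to validate the code before passing it to the model. The following is a minimal Python sketch built from the list above. `check_lang` is a hypothetical helper, not part of sherpa-onnx.

```python
# Language codes copied from the list above.
SUPPORTED_LANGS = (
    "en", "ko", "ja", "ar", "bg", "cs", "da", "de", "el", "es", "et",
    "fi", "fr", "hi", "hr", "hu", "id", "it", "lt", "lv", "nl", "pl",
    "pt", "ro", "ru", "sk", "sl", "sv", "tr", "uk", "vi",
)


def check_lang(lang):
    """Return the normalized language code or raise ValueError."""
    code = lang.strip().lower()
    if code not in SUPPORTED_LANGS:
        raise ValueError(f"unsupported language code: {lang!r}")
    return code


print(check_lang("sl"))  # prints sl
```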

Number of speakers | Sample rate
10 | 24000

Speaker IDs

sid: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9
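
To compare voices, one file per speaker ID can be generated. The sketch below only prepares the speaker IDs and output file names. It assumes 10 speakers, as implied by the IDs 0 to 9 listed above, and both helper names are hypothetical.

```python
NUM_SPEAKERS = 10  # assumption: speaker IDs 0 to 9, as listed above


def check_sid(sid, num_speakers=NUM_SPEAKERS):
    """Return sid if it is a valid speaker ID, otherwise raise ValueError."""
    if not 0 <= sid < num_speakers:
        raise ValueError(f"sid must be in [0, {num_speakers - 1}], got {sid}")
    return sid


def output_names(prefix="test", num_speakers=NUM_SPEAKERS):
    """Return (sid, filename) pairs, one per speaker."""
    return [(sid, f"{prefix}-sid{sid}.wav") for sid in range(num_speakers)]


for sid, name in output_names()[:2]:
    print(sid, name)
```

Each pair can then be used to set the `sid` field of the generation config and to name the saved file.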

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-supertonic-3-tts-int8-2026-05-11.tar.bz2

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI | URL | China mirror
arm64-v8a | Download | Download
armeabi-v7a | Download | Download
x86_64 | Download | Download
x86 | Download | Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-supertonic-3-tts-int8-2026-05-11.tar.bz2

You can use the following code to play with supertonic-3

import sherpa_onnx
import soundfile as sf

tts_config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        supertonic=sherpa_onnx.OfflineTtsSupertonicModelConfig(
            duration_predictor="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
            text_encoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
            vector_estimator="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
            vocoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
            tts_json="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
            unicode_indexer="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
            voice_style="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
        ),
        debug=False,
        num_threads=2,
        provider="cpu",
    ),
)

tts = sherpa_onnx.OfflineTts(tts_config)

gen_config = sherpa_onnx.GenerationConfig()
gen_config.sid = 0
gen_config.num_steps = 8
gen_config.speed = 1.0
gen_config.extra["lang"] = "sl"

audio = tts.generate("To je mehanizem za pretvorbo besedila v govor, ki uporablja Kaldi naslednje generacije", gen_config)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
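
To see how fast generation is, you can compute the real-time factor (RTF), that is, the elapsed time divided by the duration of the generated audio, in the same way the Node.js example on this page does. The following is a minimal sketch. `real_time_factor` and `timed` are hypothetical helpers, and a list of zeros stands in for the generated samples so that the block runs without a model.

```python
import time


def real_time_factor(elapsed_seconds: float, num_samples: int, sample_rate: int) -> float:
    """Elapsed wall-clock time divided by the duration of the generated audio."""
    duration = num_samples / sample_rate
    return elapsed_seconds / duration


def timed(fn, *args, **kwargs):
    """Call fn and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


# Stand-in for tts.generate(...): pretend 2 seconds of audio at 24 kHz came back.
samples, elapsed = timed(lambda: [0.0] * 48000)
print(f"RTF = {real_time_factor(elapsed, len(samples), 24000):.3f}")
```

With the example above, this would become `audio, elapsed = timed(tts.generate, text, gen_config)` followed by `real_time_factor(elapsed, len(audio.samples), audio.sample_rate)`. An RTF below 1 means generation is faster than playback.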

C API

You can use the following code to play with supertonic-3 with C API.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));
  config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
  config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
  config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
  config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
  config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
  config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
  config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";

  config.model.num_threads = 2;

  // If you don't want to see debug messages, please set it to 0
  config.model.debug = 1;

  const char *text = "To je mehanizem za pretvorbo besedila v govor, ki uporablja Kaldi naslednje generacije";

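  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
  if (!tts) {
    // Creation fails if, for example, a model file path is wrong
    fprintf(stderr, "Failed to create the TTS engine. Please check your config\n");
    return -1;
  }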

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.num_steps = 8;
  gen_cfg.speed = 1.0;
  gen_cfg.extra = "{\"lang\": \"sl\"}";

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg,
                                             ProgressCallback, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
 -DSHERPA_ONNX_ENABLE_C_API=ON \
 -DCMAKE_BUILD_TYPE=Release \
 -DBUILD_SHARED_LIBS=ON \
 -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
 ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-supertonic.c. Then you can compile it with the following command:

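# The source file is listed before the libraries so that linkers using
# --as-needed by default do not drop them
gcc \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-supertonic \
  /tmp/test-supertonic.c \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-c-api \
  -lonnxruntime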

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-supertonic.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

C++ API

You can use the following code to play with supertonic-3 with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;

  config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
  config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
  config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
  config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
  config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
  config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
  config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";

  config.model.num_threads = 2;

  // If you don't want to see debug messages, please set it to 0
  config.model.debug = 1;

  std::string filename = "./test.wav";
  std::string text = "To je mehanizem za pretvorbo besedila v govor, ki uporablja Kaldi naslednje generacije";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.num_steps = 8;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed
  gen_cfg.extra = R"({"lang": "sl"})";

  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
 -DSHERPA_ONNX_ENABLE_C_API=ON \
 -DCMAKE_BUILD_TYPE=Release \
 -DBUILD_SHARED_LIBS=ON \
 -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
 ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-supertonic.cc. Then you can compile it with the following command:

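# The source file is listed before the libraries so that linkers using
# --as-needed by default do not drop them
g++ \
  -std=c++17 \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-supertonic \
  /tmp/test-supertonic.cc \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-cxx-api \
  -lsherpa-onnx-c-api \
  -lonnxruntime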

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-supertonic.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with supertonic-3 with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsSupertonicModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            supertonic: OfflineTtsSupertonicModelConfig {
                duration_predictor: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx".into()),
                text_encoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx".into()),
                vector_estimator: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx".into()),
                vocoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx".into()),
                tts_json: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json".into()),
                unicode_indexer: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin".into()),
                voice_style: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "To je mehanizem za pretvorbo besedila v govor, ki uporablja Kaldi naslednje generacije";

    let gen_config = GenerationConfig {
        sid: 0,
        num_steps: 8,
        speed: 1.0,
        extra: Some(r#"{"lang": "sl"}"#.into()),
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with supertonic-3 with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      supertonic: {
        durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
        textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
        vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
        vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
        ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
        unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
        voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
      },
      debug: true,
      numThreads: 2,
      provider: 'cpu',
    },
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'To je mehanizem za pretvorbo besedila v govor, ki uporablja Kaldi naslednje generacije';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  numSteps: 8,
  speed: 1.0,
  extra: {lang: 'sl'},
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to play with supertonic-3 with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final supertonic = sherpa_onnx.OfflineTtsSupertonicModelConfig(
    durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
    textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
    vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
    vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
    ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
    unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
    voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    supertonic: supertonic,
    numThreads: 2,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    numSteps: 8,
    speed: 1.0,
    extra: {'lang': 'sl'},
  );
  final audio = tts.generateWithConfig(text: 'To je mehanizem za pretvorbo besedila v govor, ki uporablja Kaldi naslednje generacije', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with supertonic-3 with Swift API.

func run() {
  let supertonic = sherpaOnnxOfflineTtsSupertonicModelConfig(
    durationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
    textEncoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
    vectorEstimator: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
    vocoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
    ttsJson: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
    unicodeIndexer: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
    voiceStyle: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(supertonic: supertonic)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "To je mehanizem za pretvorbo besedila v govor, ki uporablja Kaldi naslednje generacije"
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.numSteps = 8
  genConfig.speed = 1.0
  genConfig.extra = ["lang": "sl"]

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with supertonic-3 with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Supertonic.DurationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.Model.Supertonic.TextEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.Model.Supertonic.VectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.Model.Supertonic.Vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.Model.Supertonic.TtsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.Model.Supertonic.UnicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.Model.Supertonic.VoiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";

config.Model.NumThreads = 2;
config.Model.Debug = 1;
config.Model.Provider = "cpu";

var tts = new OfflineTts(config);
var text = "To je mehanizem za pretvorbo besedila v govor, ki uporablja Kaldi naslednje generacije";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.NumSteps = 8;
genConfig.Speed = 1.0f;
genConfig.Extra = "{\"lang\": \"sl\"}";

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with supertonic-3 with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      supertonic = OfflineTtsSupertonicModelConfig(
        durationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
        textEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
        vectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
        vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
        ttsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
        unicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
        voiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
      ),
      numThreads = 2,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    numSteps = 8,
    speed = 1.0f,
    extra = mapOf("lang" to "sl"),
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "To je mehanizem za pretvorbo besedila v govor, ki uporablja Kaldi naslednje generacije",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with supertonic-3 with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var supertonic = new OfflineTtsSupertonicModelConfig();
    supertonic.setDurationPredictor("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx");
    supertonic.setTextEncoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx");
    supertonic.setVectorEstimator("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx");
    supertonic.setVocoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx");
    supertonic.setTtsJson("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json");
    supertonic.setUnicodeIndexer("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin");
    supertonic.setVoiceStyle("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setSupertonic(supertonic);
    modelConfig.setNumThreads(2);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);

    var tts = new OfflineTts(config);
    var text = "To je mehanizem za pretvorbo besedila v govor, ki uporablja Kaldi naslednje generacije";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setNumSteps(8);
    genConfig.setSpeed(1.0f);
    genConfig.setExtra("{\"lang\": \"sl\"}");

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with supertonic-3 with Pascal API.

program test_supertonic;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Supertonic.DurationPredictor := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx';
  Config.Model.Supertonic.TextEncoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx';
  Config.Model.Supertonic.VectorEstimator := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx';
  Config.Model.Supertonic.Vocoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx';
  Config.Model.Supertonic.TtsJson := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json';
  Config.Model.Supertonic.UnicodeIndexer := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin';
  Config.Model.Supertonic.VoiceStyle := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin';

  Config.Model.NumThreads := 2;
  Config.Model.Debug := True;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.NumSteps := 8;
  GenConfig.Speed := 1.0;
  GenConfig.Extra := '{"lang": "sl"}';

  Audio := Tts.GenerateWithConfig('To je mehanizem za pretvorbo besedila v govor, ki uporablja Kaldi naslednje generacije', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with supertonic-3 with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Supertonic: sherpa.OfflineTtsSupertonicModelConfig{
				DurationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
				TextEncoder:       "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
				VectorEstimator:   "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
				Vocoder:           "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
				TtsJson:           "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
				UnicodeIndexer:    "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
				VoiceStyle:        "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
			},
			NumThreads: 2,
			Debug:      true,
		},
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "To je mehanizem za pretvorbo besedila v govor, ki uporablja Kaldi naslednje generacije"

	genConfig := sherpa.GenerationConfig{
		Sid:       0,
		NumSteps:  8,
		Speed:     1.0,
		Extra:     `{"lang": "sl"}`,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

Sample audio for each speaker is listed below. Each entry shows the sentence index followed by the text that was synthesized:

Speaker 0

0

Pozdravljen svet.

1

Kako si danes?

2

Nebo je modro in veter je nežen.

3

Strojno učenje pomaga računalnikom učiti se iz podatkov.

4

Sinteza govora pretvori besedilo v jasen zvok.

5

Učenci so v knjižnici prebrali kratko zgodbo.

6

Vlak je zamujal zaradi vzdrževanja tirov.

7

Majhni modeli hitro delujejo na lokalnih napravah.

8

Glasovni pomočnik pomaga pri vsakodnevnih opravilih.

9

Stabilno branje je pomembno za kratke in dolge stavke.

Speaker 1

0

Pozdravljen svet.

1

Kako si danes?

2

Nebo je modro in veter je nežen.

3

Strojno učenje pomaga računalnikom učiti se iz podatkov.

4

Sinteza govora pretvori besedilo v jasen zvok.

5

Učenci so v knjižnici prebrali kratko zgodbo.

6

Vlak je zamujal zaradi vzdrževanja tirov.

7

Majhni modeli hitro delujejo na lokalnih napravah.

8

Glasovni pomočnik pomaga pri vsakodnevnih opravilih.

9

Stabilno branje je pomembno za kratke in dolge stavke.

Speaker 2

0

Pozdravljen svet.

1

Kako si danes?

2

Nebo je modro in veter je nežen.

3

Strojno učenje pomaga računalnikom učiti se iz podatkov.

4

Sinteza govora pretvori besedilo v jasen zvok.

5

Učenci so v knjižnici prebrali kratko zgodbo.

6

Vlak je zamujal zaradi vzdrževanja tirov.

7

Majhni modeli hitro delujejo na lokalnih napravah.

8

Glasovni pomočnik pomaga pri vsakodnevnih opravilih.

9

Stabilno branje je pomembno za kratke in dolge stavke.

Speaker 3

0

Pozdravljen svet.

1

Kako si danes?

2

Nebo je modro in veter je nežen.

3

Strojno učenje pomaga računalnikom učiti se iz podatkov.

4

Sinteza govora pretvori besedilo v jasen zvok.

5

Učenci so v knjižnici prebrali kratko zgodbo.

6

Vlak je zamujal zaradi vzdrževanja tirov.

7

Majhni modeli hitro delujejo na lokalnih napravah.

8

Glasovni pomočnik pomaga pri vsakodnevnih opravilih.

9

Stabilno branje je pomembno za kratke in dolge stavke.

Speaker 4

0

Pozdravljen svet.

1

Kako si danes?

2

Nebo je modro in veter je nežen.

3

Strojno učenje pomaga računalnikom učiti se iz podatkov.

4

Sinteza govora pretvori besedilo v jasen zvok.

5

Učenci so v knjižnici prebrali kratko zgodbo.

6

Vlak je zamujal zaradi vzdrževanja tirov.

7

Majhni modeli hitro delujejo na lokalnih napravah.

8

Glasovni pomočnik pomaga pri vsakodnevnih opravilih.

9

Stabilno branje je pomembno za kratke in dolge stavke.

Speaker 5

0

Pozdravljen svet.

1

Kako si danes?

2

Nebo je modro in veter je nežen.

3

Strojno učenje pomaga računalnikom učiti se iz podatkov.

4

Sinteza govora pretvori besedilo v jasen zvok.

5

Učenci so v knjižnici prebrali kratko zgodbo.

6

Vlak je zamujal zaradi vzdrževanja tirov.

7

Majhni modeli hitro delujejo na lokalnih napravah.

8

Glasovni pomočnik pomaga pri vsakodnevnih opravilih.

9

Stabilno branje je pomembno za kratke in dolge stavke.

Speaker 6

0

Pozdravljen svet.

1

Kako si danes?

2

Nebo je modro in veter je nežen.

3

Strojno učenje pomaga računalnikom učiti se iz podatkov.

4

Sinteza govora pretvori besedilo v jasen zvok.

5

Učenci so v knjižnici prebrali kratko zgodbo.

6

Vlak je zamujal zaradi vzdrževanja tirov.

7

Majhni modeli hitro delujejo na lokalnih napravah.

8

Glasovni pomočnik pomaga pri vsakodnevnih opravilih.

9

Stabilno branje je pomembno za kratke in dolge stavke.

Speaker 7

0

Pozdravljen svet.

1

Kako si danes?

2

Nebo je modro in veter je nežen.

3

Strojno učenje pomaga računalnikom učiti se iz podatkov.

4

Sinteza govora pretvori besedilo v jasen zvok.

5

Učenci so v knjižnici prebrali kratko zgodbo.

6

Vlak je zamujal zaradi vzdrževanja tirov.

7

Majhni modeli hitro delujejo na lokalnih napravah.

8

Glasovni pomočnik pomaga pri vsakodnevnih opravilih.

9

Stabilno branje je pomembno za kratke in dolge stavke.

Speaker 8

0

Pozdravljen svet.

1

Kako si danes?

2

Nebo je modro in veter je nežen.

3

Strojno učenje pomaga računalnikom učiti se iz podatkov.

4

Sinteza govora pretvori besedilo v jasen zvok.

5

Učenci so v knjižnici prebrali kratko zgodbo.

6

Vlak je zamujal zaradi vzdrževanja tirov.

7

Majhni modeli hitro delujejo na lokalnih napravah.

8

Glasovni pomočnik pomaga pri vsakodnevnih opravilih.

9

Stabilno branje je pomembno za kratke in dolge stavke.

Speaker 9

0

Pozdravljen svet.

1

Kako si danes?

2

Nebo je modro in veter je nežen.

3

Strojno učenje pomaga računalnikom učiti se iz podatkov.

4

Sinteza govora pretvori besedilo v jasen zvok.

5

Učenci so v knjižnici prebrali kratko zgodbo.

6

Vlak je zamujal zaradi vzdrževanja tirov.

7

Majhni modeli hitro delujejo na lokalnih napravah.

8

Glasovni pomočnik pomaga pri vsakodnevnih opravilih.

9

Stabilno branje je pomembno za kratke in dolge stavke.
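
A sample set like the one above can be produced by looping over every speaker ID and sentence. The following is a minimal sketch. It assumes `tts` and `gen_config` are built as in the Python API example for this model. The file naming scheme and the names `sample_jobs`, `generate_all` and `write_wave` are hypothetical, and only two of the ten sentences are included to keep it short.

```python
from itertools import product

SENTENCES = [
    "Pozdravljen svet.",
    "Kako si danes?",
]
NUM_SPEAKERS = 10  # speaker IDs 0..9, as listed for this model


def sample_jobs(num_speakers, sentences):
    """Yield (sid, index, text, filename) for every speaker/sentence pair."""
    for sid, (idx, text) in product(range(num_speakers), enumerate(sentences)):
        yield sid, idx, text, f"sl-speaker{sid}-{idx}.wav"


def generate_all(tts, gen_config, write_wave):
    """Render every job; write_wave(filename, samples, sample_rate) saves one file."""
    for sid, idx, text, filename in sample_jobs(NUM_SPEAKERS, SENTENCES):
        gen_config.sid = sid
        audio = tts.generate(text, gen_config)
        write_wave(filename, audio.samples, audio.sample_rate)


jobs = list(sample_jobs(NUM_SPEAKERS, SENTENCES))
print(len(jobs), jobs[0][3], jobs[-1][3])
```

With soundfile, `write_wave` could be `lambda f, s, sr: sf.write(f, s, samplerate=sr)`.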

Spanish

This section lists text to speech models for Spanish.

vits-piper-es_AR-daniela-high

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/es/es_AR/daniela/high

Number of speakers | Sample rate
1 | 22050

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-es_AR-daniela-high.tar.bz2

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI | URL | China mirror
arm64-v8a | Download | Download
armeabi-v7a | Download | Download
x86_64 | Download | Download
x86 | Download | Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-es_AR-daniela-high with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-es_AR-daniela-high/es_AR-daniela-high.onnx";
  config.model.vits.tokens = "vits-piper-es_AR-daniela-high/tokens.txt";
  config.model.vits.data_dir = "vits-piper-es_AR-daniela-high/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

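  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
  if (!tts) {
    // Creation fails if, for example, a model file path is wrong
    fprintf(stderr, "Failed to create the TTS engine. Please check your config\n");
    return -1;
  }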

  const char *text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
 -DSHERPA_ONNX_ENABLE_C_API=ON \
 -DCMAKE_BUILD_TYPE=Release \
 -DBUILD_SHARED_LIBS=ON \
 -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
 ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

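# The source file is listed before the libraries so that linkers using
# --as-needed by default do not drop them
gcc \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-piper \
  /tmp/test-piper.c \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-c-api \
  -lonnxruntime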

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-es_AR-daniela-high.tar.bz2

You can use the following code to play with vits-piper-es_AR-daniela-high

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-es_AR-daniela-high/es_AR-daniela-high.onnx",
            data_dir="vits-piper-es_AR-daniela-high/espeak-ng-data",
            tokens="vits-piper-es_AR-daniela-high/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.",
                     sid=0,
                     speed=1.0)

sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
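If soundfile is not available, the samples can also be written with Python's standard wave module. This is a minimal sketch, assuming that the samples are floats in the range [-1, 1] (as audio.samples appears to be in the example above); the helper name write_wav_16bit is hypothetical.

```python
import struct
import wave


def write_wav_16bit(filename, samples, sample_rate):
    """Write mono float samples in [-1, 1] as a 16-bit PCM WAV file."""
    pcm = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))  # clip to avoid integer overflow
        pcm += struct.pack("<h", int(s * 32767))
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)  # mono
        f.setsampwidth(2)  # 2 bytes = 16 bits per sample
        f.setframerate(sample_rate)
        f.writeframes(bytes(pcm))
```

With the objects from the example above, the call would be write_wav_16bit("test.wav", audio.samples, audio.sample_rate).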

C++ API

You can use the following code to play with vits-piper-es_AR-daniela-high with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-es_AR-daniela-high/es_AR-daniela-high.onnx";
  config.model.vits.tokens = "vits-piper-es_AR-daniela-high/tokens.txt";
  config.model.vits.data_dir = "vits-piper-es_AR-daniela-high/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and libraries inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.cc -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html
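The callback convention used in the C++ example (return 1 to continue, 0 to stop) can be illustrated without the library. The Python sketch below is a simulation only: fake_generate and its chunking are hypothetical stand-ins and do not reflect how sherpa-onnx is implemented.

```python
def fake_generate(chunks, callback):
    """Feed chunks to callback one by one; stop early when it returns 0."""
    produced = []
    total = len(chunks)
    for i, chunk in enumerate(chunks, start=1):
        produced.extend(chunk)
        progress = i / total
        if callback(chunk, progress) == 0:
            break  # the caller asked to stop generating
    return produced


def stop_after_half(samples, progress):
    # Return 1 to continue, 0 to stop, mirroring the convention above.
    return 1 if progress < 0.5 else 0
```

A callback like stop_after_half lets an application cancel a long synthesis, for instance when the user presses a stop button.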

Rust API

You can use the following code to play with vits-piper-es_AR-daniela-high with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-es_AR-daniela-high/es_AR-daniela-high.onnx".into()),
                tokens: Some("vits-piper-es_AR-daniela-high/tokens.txt".into()),
                data_dir: Some("vits-piper-es_AR-daniela-high/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-es_AR-daniela-high with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-es_AR-daniela-high/es_AR-daniela-high.onnx',
        tokens: 'vits-piper-es_AR-daniela-high/tokens.txt',
        dataDir: 'vits-piper-es_AR-daniela-high/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.
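The real-time factor (RTF) printed by the example above is the processing time divided by the duration of the generated audio; a value below 1 means that synthesis runs faster than playback. The same arithmetic as a small Python sketch (the function name is hypothetical):

```python
def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """RTF = processing time / duration of the generated audio."""
    duration = num_samples / sample_rate  # seconds of audio
    if duration <= 0:
        raise ValueError("no audio was generated")
    return elapsed_seconds / duration
```

For instance, taking 0.5 seconds to produce 32000 samples at 16000 Hz (2 seconds of audio) gives an RTF of 0.25.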

Dart API

You can use the following code to play with vits-piper-es_AR-daniela-high with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-es_AR-daniela-high/es_AR-daniela-high.onnx',
    tokens: 'vits-piper-es_AR-daniela-high/tokens.txt',
    dataDir: 'vits-piper-es_AR-daniela-high/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-es_AR-daniela-high with Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-es_AR-daniela-high/es_AR-daniela-high.onnx",
    lexicon: "",
    tokens: "vits-piper-es_AR-daniela-high/tokens.txt",
    dataDir: "vits-piper-es_AR-daniela-high/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-es_AR-daniela-high with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-es_AR-daniela-high/es_AR-daniela-high.onnx";
config.Model.Vits.Tokens = "vits-piper-es_AR-daniela-high/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-es_AR-daniela-high/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-es_AR-daniela-high with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-es_AR-daniela-high/es_AR-daniela-high.onnx",
        tokens = "vits-piper-es_AR-daniela-high/tokens.txt",
        dataDir = "vits-piper-es_AR-daniela-high/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with vits-piper-es_AR-daniela-high with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-es_AR-daniela-high/es_AR-daniela-high.onnx");
    vits.setTokens("vits-piper-es_AR-daniela-high/tokens.txt");
    vits.setDataDir("vits-piper-es_AR-daniela-high/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-es_AR-daniela-high with Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-es_AR-daniela-high/es_AR-daniela-high.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-es_AR-daniela-high/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-es_AR-daniela-high/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-es_AR-daniela-high with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-es_AR-daniela-high/es_AR-daniela-high.onnx",
				Tokens:  "vits-piper-es_AR-daniela-high/tokens.txt",
				DataDir: "vits-piper-es_AR-daniela-high/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.

sample audio for each speaker is listed below:

Speaker 0

vits-piper-es_ES-carlfm-x_low

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/es/es_ES/carlfm/x_low

Number of speakers    Sample rate
1                     16000
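The examples on this page follow a consistent file layout: the directory vits-piper-<voice> contains <voice>.onnx, tokens.txt, and espeak-ng-data. The Python sketch below encodes that convention; it is inferred from the paths used in the examples rather than from any official specification, and the helper name piper_model_paths is hypothetical.

```python
import posixpath

PREFIX = "vits-piper-"


def piper_model_paths(model_dir):
    """Derive the model, tokens and espeak-ng data paths from a model directory."""
    name = posixpath.basename(model_dir.rstrip("/"))
    if not name.startswith(PREFIX):
        raise ValueError(f"unexpected model directory name: {name}")
    voice = name[len(PREFIX):]  # e.g. "es_ES-carlfm-x_low"
    return {
        "model": posixpath.join(model_dir, voice + ".onnx"),
        "tokens": posixpath.join(model_dir, "tokens.txt"),
        "data_dir": posixpath.join(model_dir, "espeak-ng-data"),
    }
```

Calling piper_model_paths("vits-piper-es_ES-carlfm-x_low") yields the three paths used by the API examples for this model.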

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-es_ES-carlfm-x_low.tar.bz2
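For scripted setups, the archive can be fetched and unpacked with Python's standard library. The sketch below is one possible approach; the function names are hypothetical, nothing is downloaded until download_and_extract is called, and it should only be used with archives you trust.

```python
import os
import tarfile
import urllib.request


def extract_archive(archive_path, dest_dir="."):
    """Unpack a .tar.bz2 archive and return its top-level entry names."""
    with tarfile.open(archive_path, "r:bz2") as tar:
        top_level = sorted({m.name.split("/")[0] for m in tar.getmembers()})
        tar.extractall(dest_dir)
    return top_level


def download_and_extract(url, dest_dir="."):
    """Download url into dest_dir, unpack it, then delete the archive."""
    archive_path = os.path.join(dest_dir, os.path.basename(url))
    urllib.request.urlretrieve(url, archive_path)
    try:
        return extract_archive(archive_path, dest_dir)
    finally:
        os.remove(archive_path)
```

Passing the download address above to download_and_extract should leave a vits-piper-es_ES-carlfm-x_low directory in dest_dir, assuming the archive unpacks into a directory named after the model.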

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI            URL         China mirror
arm64-v8a      Download    Download
armeabi-v7a    Download    Download
x86_64         Download    Download
x86            Download    Download

If you don’t know what ABI is, you probably need to select arm64-v8a.
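The ABI corresponds to the CPU architecture of the device. As a rough guide, the Python sketch below maps common machine names (as reported by uname -m) to the Android ABI names listed above. The mapping table and the function name are illustrative and do not cover every device.

```python
# Common `uname -m` values and the Android ABI they correspond to.
MACHINE_TO_ABI = {
    "aarch64": "arm64-v8a",
    "arm64": "arm64-v8a",
    "armv7l": "armeabi-v7a",
    "x86_64": "x86_64",
    "i686": "x86",
    "i386": "x86",
}


def abi_for_machine(machine, default="arm64-v8a"):
    """Return the Android ABI for a machine name, falling back to arm64-v8a."""
    return MACHINE_TO_ABI.get(machine.lower(), default)
```

On a connected device, adb shell getprop ro.product.cpu.abi prints the ABI directly, which avoids the lookup altogether.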

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-es_ES-carlfm-x_low with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-es_ES-carlfm-x_low/es_ES-carlfm-x_low.onnx";
  config.model.vits.tokens = "vits-piper-es_ES-carlfm-x_low/tokens.txt";
  config.model.vits.data_dir = "vits-piper-es_ES-carlfm-x_low/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

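  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
  if (!tts) {
    // Creation fails if, for example, a model path is wrong
    fprintf(stderr, "Failed to create the TTS engine. Please check your config\n");
    return -1;
  }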

  const char *text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and libraries inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.c -lsherpa-onnx-c-api -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-es_ES-carlfm-x_low.tar.bz2

You can use the following code to play with vits-piper-es_ES-carlfm-x_low

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-es_ES-carlfm-x_low/es_ES-carlfm-x_low.onnx",
            data_dir="vits-piper-es_ES-carlfm-x_low/espeak-ng-data",
            tokens="vits-piper-es_ES-carlfm-x_low/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.",
                     sid=0,
                     speed=1.0)

sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
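If the configuration is rejected, a wrong path is one possible cause. Before constructing the config, it may help to check that the files exist. The sketch below uses only the standard library; missing_model_files is a hypothetical helper and not part of sherpa-onnx.

```python
import os


def missing_model_files(model, tokens, data_dir):
    """Return the paths that do not exist; an empty list means all were found."""
    missing = [p for p in (model, tokens) if not os.path.isfile(p)]
    if not os.path.isdir(data_dir):
        missing.append(data_dir)
    return missing
```

An empty result means that all three paths were found, so any remaining configuration error lies elsewhere.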

C++ API

You can use the following code to play with vits-piper-es_ES-carlfm-x_low with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-es_ES-carlfm-x_low/es_ES-carlfm-x_low.onnx";
  config.model.vits.tokens = "vits-piper-es_ES-carlfm-x_low/tokens.txt";
  config.model.vits.data_dir = "vits-piper-es_ES-carlfm-x_low/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and libraries inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.cc -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with vits-piper-es_ES-carlfm-x_low with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-es_ES-carlfm-x_low/es_ES-carlfm-x_low.onnx".into()),
                tokens: Some("vits-piper-es_ES-carlfm-x_low/tokens.txt".into()),
                data_dir: Some("vits-piper-es_ES-carlfm-x_low/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-es_ES-carlfm-x_low with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-es_ES-carlfm-x_low/es_ES-carlfm-x_low.onnx',
        tokens: 'vits-piper-es_ES-carlfm-x_low/tokens.txt',
        dataDir: 'vits-piper-es_ES-carlfm-x_low/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.
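The maxNumSentences option in the example above limits how many sentences are processed in one batch, which keeps memory use bounded for long input text. The Python sketch below only illustrates the batching idea: the naive punctuation-based splitter and the function name are assumptions, and the text processing inside sherpa-onnx may differ.

```python
import re


def batch_sentences(text, max_num_sentences=1):
    """Split text into sentences and group them into batches."""
    # Naive split: break after ., ! or ? followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    if max_num_sentences <= 0:
        # Assumption: a non-positive limit means "everything in one batch".
        return [sentences] if sentences else []
    return [
        sentences[i:i + max_num_sentences]
        for i in range(0, len(sentences), max_num_sentences)
    ]
```

With a limit of 1, each sentence forms its own batch, so a long paragraph is processed one sentence at a time.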

Dart API

You can use the following code to play with vits-piper-es_ES-carlfm-x_low with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-es_ES-carlfm-x_low/es_ES-carlfm-x_low.onnx',
    tokens: 'vits-piper-es_ES-carlfm-x_low/tokens.txt',
    dataDir: 'vits-piper-es_ES-carlfm-x_low/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-es_ES-carlfm-x_low with Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-es_ES-carlfm-x_low/es_ES-carlfm-x_low.onnx",
    lexicon: "",
    tokens: "vits-piper-es_ES-carlfm-x_low/tokens.txt",
    dataDir: "vits-piper-es_ES-carlfm-x_low/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-es_ES-carlfm-x_low with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-es_ES-carlfm-x_low/es_ES-carlfm-x_low.onnx";
config.Model.Vits.Tokens = "vits-piper-es_ES-carlfm-x_low/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-es_ES-carlfm-x_low/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-es_ES-carlfm-x_low with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-es_ES-carlfm-x_low/es_ES-carlfm-x_low.onnx",
        tokens = "vits-piper-es_ES-carlfm-x_low/tokens.txt",
        dataDir = "vits-piper-es_ES-carlfm-x_low/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with vits-piper-es_ES-carlfm-x_low with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-es_ES-carlfm-x_low/es_ES-carlfm-x_low.onnx");
    vits.setTokens("vits-piper-es_ES-carlfm-x_low/tokens.txt");
    vits.setDataDir("vits-piper-es_ES-carlfm-x_low/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-es_ES-carlfm-x_low with Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-es_ES-carlfm-x_low/es_ES-carlfm-x_low.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-es_ES-carlfm-x_low/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-es_ES-carlfm-x_low/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-es_ES-carlfm-x_low with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-es_ES-carlfm-x_low/es_ES-carlfm-x_low.onnx",
				Tokens:  "vits-piper-es_ES-carlfm-x_low/tokens.txt",
				DataDir: "vits-piper-es_ES-carlfm-x_low/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.

audio samples for each speaker are listed below:

Speaker 0

vits-piper-es_ES-davefx-medium

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/es/es_ES/davefx/medium

Number of speakers | Sample rate (Hz)
1 | 22050

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-es_ES-davefx-medium.tar.bz2
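The archive above can also be fetched and unpacked from a script. The following is a minimal sketch in Python that uses only the standard library; the helper names extract_archive and download_and_extract are illustrative and are not part of sherpa-onnx.

```python
import tarfile
import urllib.request
from pathlib import Path

# URL taken from the "Model download address" above.
MODEL_URL = (
    "https://github.com/k2-fsa/sherpa-onnx/releases/download/"
    "tts-models/vits-piper-es_ES-davefx-medium.tar.bz2"
)


def extract_archive(archive, dest):
    """Extract a .tar.bz2 archive into dest and return the member names."""
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive, "r:bz2") as tar:
        names = tar.getnames()
        tar.extractall(dest)
    return names


def download_and_extract(url=MODEL_URL, dest="."):
    """Download the archive (skipped if it already exists) and unpack it."""
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    archive = dest / url.rsplit("/", 1)[-1]
    if not archive.exists():
        urllib.request.urlretrieve(url, str(archive))
    return extract_archive(archive, dest)


# Usage (performs a network download, so it is left commented out):
# download_and_extract()
```

After extraction, the model directory vits-piper-es_ES-davefx-medium should sit next to the script, which matches the relative paths used in the examples below.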

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI | URL | China mirror
arm64-v8a | Download | Download
armeabi-v7a | Download | Download
x86_64 | Download | Download
x86 | Download | Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-es_ES-davefx-medium with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-es_ES-davefx-medium/es_ES-davefx-medium.onnx";
  config.model.vits.tokens = "vits-piper-es_ES-davefx-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-es_ES-davefx-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.
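The environment setup above can also be scripted. The following Python sketch prepends the library directory to the loader search path and then runs the compiled binary; env_with_library_path and run_example are hypothetical helper names, and the paths are the ones assumed in this section.

```python
import os
import platform
import subprocess


def env_with_library_path(lib_dir, base_env=None, system=None):
    """Return a copy of the environment with lib_dir prepended to the
    dynamic-loader search path: DYLD_LIBRARY_PATH on macOS,
    LD_LIBRARY_PATH otherwise."""
    env = dict(os.environ if base_env is None else base_env)
    system = system or platform.system()
    var = "DYLD_LIBRARY_PATH" if system == "Darwin" else "LD_LIBRARY_PATH"
    old = env.get(var, "")
    env[var] = lib_dir + (os.pathsep + old if old else "")
    return env


def run_example(binary="/tmp/test-piper",
                lib_dir="/tmp/sherpa-onnx/shared/lib",
                cwd="/tmp"):
    """Run the compiled example from cwd so that relative model paths resolve."""
    env = env_with_library_path(lib_dir)
    return subprocess.run([binary], cwd=cwd, env=env, check=True)
```

Passing the environment explicitly keeps the change local to the child process, so the calling shell is not modified.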

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-es_ES-davefx-medium.tar.bz2

You can use the following code to play with vits-piper-es_ES-davefx-medium with Python API.

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-es_ES-davefx-medium/es_ES-davefx-medium.onnx",
            data_dir="vits-piper-es_ES-davefx-medium/espeak-ng-data",
            tokens="vits-piper-es_ES-davefx-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
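If soundfile is not available, the generated samples can be written with Python's standard wave module instead. The sketch below assumes the samples are mono floats normalized to [-1, 1]; write_wave_pcm16 is an illustrative helper, not a sherpa-onnx function.

```python
import struct
import wave


def write_wave_pcm16(filename, samples, sample_rate):
    """Write mono float samples as a 16-bit PCM WAV file.

    Assumption: samples are floats in [-1, 1]; values outside that
    range are clipped before conversion.
    """
    frames = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))  # clip to the valid range
        frames += struct.pack("<h", int(round(s * 32767)))
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)            # mono
        f.setsampwidth(2)            # 2 bytes per sample = 16-bit
        f.setframerate(sample_rate)
        f.writeframes(bytes(frames))
    return len(samples)


# Usage with the objects from the example above:
# write_wave_pcm16("test.wav", audio.samples, audio.sample_rate)
```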

C++ API

You can use the following code to play with vits-piper-es_ES-davefx-medium with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-es_ES-davefx-medium/es_ES-davefx-medium.onnx";
  config.model.vits.tokens = "vits-piper-es_ES-davefx-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-es_ES-davefx-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with vits-piper-es_ES-davefx-medium with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-es_ES-davefx-medium/es_ES-davefx-medium.onnx".into()),
                tokens: Some("vits-piper-es_ES-davefx-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-es_ES-davefx-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-es_ES-davefx-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-es_ES-davefx-medium/es_ES-davefx-medium.onnx',
        tokens: 'vits-piper-es_ES-davefx-medium/tokens.txt',
        dataDir: 'vits-piper-es_ES-davefx-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.
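The real-time factor (RTF) printed by the example above is the synthesis time divided by the duration of the generated audio, so a value below 1 means synthesis ran faster than playback. The same arithmetic, sketched in Python with hypothetical helper names:

```python
import time


def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """RTF = time spent generating / duration of the generated audio."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate
    return elapsed_seconds / duration


def timed(generate):
    """Call generate() and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = generate()
    return result, time.perf_counter() - start


# Example: 2 seconds of audio at 22050 Hz generated in 0.5 seconds.
print(real_time_factor(0.5, 44100, 22050))  # 0.25
```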

Dart API

You can use the following code to play with vits-piper-es_ES-davefx-medium with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-es_ES-davefx-medium/es_ES-davefx-medium.onnx',
    tokens: 'vits-piper-es_ES-davefx-medium/tokens.txt',
    dataDir: 'vits-piper-es_ES-davefx-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-es_ES-davefx-medium with Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-es_ES-davefx-medium/es_ES-davefx-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-es_ES-davefx-medium/tokens.txt",
    dataDir: "vits-piper-es_ES-davefx-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-es_ES-davefx-medium with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-es_ES-davefx-medium/es_ES-davefx-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-es_ES-davefx-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-es_ES-davefx-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-es_ES-davefx-medium with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-es_ES-davefx-medium/es_ES-davefx-medium.onnx",
        tokens = "vits-piper-es_ES-davefx-medium/tokens.txt",
        dataDir = "vits-piper-es_ES-davefx-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.
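The callback in the example above always returns 1, so generation never stops early. To illustrate the return-value contract (1 to continue, 0 to stop), here is a language-neutral sketch written in Python. The drive function is a hypothetical stand-in for the engine's loop, not sherpa-onnx code, and StopAfter is an illustrative name.

```python
class StopAfter:
    """Callback that keeps every chunk and asks to stop once at least
    max_samples samples have been received."""

    def __init__(self, max_samples):
        self.max_samples = max_samples
        self.chunks = []
        self.total = 0

    def __call__(self, samples):
        self.chunks.append(list(samples))
        self.total += len(samples)
        # 1 means continue, 0 means stop
        return 1 if self.total < self.max_samples else 0


def drive(callback, chunks):
    """Hypothetical stand-in for the engine: hand chunks to the callback
    one by one and stop as soon as it returns 0."""
    delivered = 0
    for chunk in chunks:
        delivered += 1
        if callback(chunk) == 0:
            break
    return delivered


cb = StopAfter(max_samples=5)
print(drive(cb, [[0.0] * 3] * 4), cb.total)  # 2 6
```

A callback of this shape is useful for streaming playback, where each chunk can be queued to the audio device as soon as it arrives.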

Java API

You can use the following code to play with vits-piper-es_ES-davefx-medium with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-es_ES-davefx-medium/es_ES-davefx-medium.onnx");
    vits.setTokens("vits-piper-es_ES-davefx-medium/tokens.txt");
    vits.setDataDir("vits-piper-es_ES-davefx-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-es_ES-davefx-medium with Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-es_ES-davefx-medium/es_ES-davefx-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-es_ES-davefx-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-es_ES-davefx-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-es_ES-davefx-medium with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-es_ES-davefx-medium/es_ES-davefx-medium.onnx",
				Tokens:  "vits-piper-es_ES-davefx-medium/tokens.txt",
				DataDir: "vits-piper-es_ES-davefx-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.

audio samples for each speaker are listed below:

Speaker 0

vits-piper-es_ES-glados-medium

Info about this model

This model is converted from https://github.com/rhasspy/piper/issues/187#issuecomment-1802216304

Number of speakers | Sample rate (Hz)
1 | 22050

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-es_ES-glados-medium.tar.bz2
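Once the archive is extracted, it can help to confirm that the files referenced by the API examples in this section are present before creating the TTS object. The following Python sketch checks for the three paths those examples use; missing_model_files is an illustrative helper name, not part of sherpa-onnx.

```python
from pathlib import Path


def missing_model_files(model_dir, name="es_ES-glados-medium"):
    """Return the expected model paths that do not exist under model_dir.

    The expected entries mirror the paths used by the API examples:
    <name>.onnx, tokens.txt, and the espeak-ng-data directory.
    """
    root = Path(model_dir)
    expected = [
        root / f"{name}.onnx",
        root / "tokens.txt",
        root / "espeak-ng-data",
    ]
    return [str(p) for p in expected if not p.exists()]


# Usage: an empty list means all three paths were found.
print(missing_model_files("vits-piper-es_ES-glados-medium"))
```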

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI | URL | China mirror
arm64-v8a | Download | Download
armeabi-v7a | Download | Download
x86_64 | Download | Download
x86 | Download | Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-es_ES-glados-medium with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-es_ES-glados-medium/es_ES-glados-medium.onnx";
  config.model.vits.tokens = "vits-piper-es_ES-glados-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-es_ES-glados-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-es_ES-glados-medium.tar.bz2

You can use the following code to play with vits-piper-es_ES-glados-medium with Python API.

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-es_ES-glados-medium/es_ES-glados-medium.onnx",
            data_dir="vits-piper-es_ES-glados-medium/espeak-ng-data",
            tokens="vits-piper-es_ES-glados-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)

C++ API

You can use the following code to play with vits-piper-es_ES-glados-medium with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-es_ES-glados-medium/es_ES-glados-medium.onnx";
  config.model.vits.tokens = "vits-piper-es_ES-glados-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-es_ES-glados-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++   -std=c++17   -I /tmp/sherpa-onnx/shared/include   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-cxx-api   -lsherpa-onnx-c-api   -lonnxruntime   -o /tmp/test-piper   /tmp/test-piper.cc

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with vits-piper-es_ES-glados-medium with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-es_ES-glados-medium/es_ES-glados-medium.onnx".into()),
                tokens: Some("vits-piper-es_ES-glados-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-es_ES-glados-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-es_ES-glados-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-es_ES-glados-medium/es_ES-glados-medium.onnx',
        tokens: 'vits-piper-es_ES-glados-medium/tokens.txt',
        dataDir: 'vits-piper-es_ES-glados-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to play with vits-piper-es_ES-glados-medium with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-es_ES-glados-medium/es_ES-glados-medium.onnx',
    tokens: 'vits-piper-es_ES-glados-medium/tokens.txt',
    dataDir: 'vits-piper-es_ES-glados-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-es_ES-glados-medium with Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-es_ES-glados-medium/es_ES-glados-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-es_ES-glados-medium/tokens.txt",
    dataDir: "vits-piper-es_ES-glados-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-es_ES-glados-medium with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-es_ES-glados-medium/es_ES-glados-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-es_ES-glados-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-es_ES-glados-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-es_ES-glados-medium with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-es_ES-glados-medium/es_ES-glados-medium.onnx",
        tokens = "vits-piper-es_ES-glados-medium/tokens.txt",
        dataDir = "vits-piper-es_ES-glados-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with vits-piper-es_ES-glados-medium with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-es_ES-glados-medium/es_ES-glados-medium.onnx");
    vits.setTokens("vits-piper-es_ES-glados-medium/tokens.txt");
    vits.setDataDir("vits-piper-es_ES-glados-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-es_ES-glados-medium with Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-es_ES-glados-medium/es_ES-glados-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-es_ES-glados-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-es_ES-glados-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-es_ES-glados-medium with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-es_ES-glados-medium/es_ES-glados-medium.onnx",
				Tokens:  "vits-piper-es_ES-glados-medium/tokens.txt",
				DataDir: "vits-piper-es_ES-glados-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.

sample audio for each speaker is listed below:

Speaker 0

vits-piper-es_ES-miro-high

Info about this model

This model is converted from https://huggingface.co/OpenVoiceOS/pipertts_es-ES_miro

Number of speakers    Sample rate (Hz)
1                     22050

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-es_ES-miro-high.tar.bz2
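If you prefer to fetch and unpack the archive from a script, the following is a minimal sketch that uses only the Python standard library. The function names download and extract are illustrative assumptions; the URL is the one given above.

```python
import tarfile
import urllib.request
from pathlib import Path

MODEL_URL = (
    "https://github.com/k2-fsa/sherpa-onnx/releases/download/"
    "tts-models/vits-piper-es_ES-miro-high.tar.bz2"
)


def download(url, dest_dir="."):
    """Download url into dest_dir, skipping the download if the file exists."""
    dest = Path(dest_dir) / url.rsplit("/", 1)[-1]
    if not dest.exists():
        urllib.request.urlretrieve(url, dest)
    return dest


def extract(archive, dest_dir="."):
    """Extract a .tar.bz2 archive into dest_dir and return its top-level names."""
    with tarfile.open(archive, "r:bz2") as tar:
        names = sorted({m.name.split("/", 1)[0] for m in tar.getmembers()})
        tar.extractall(dest_dir)
    return names


# Usage (needs network access, so it is not executed here):
#   archive = download(MODEL_URL)
#   print(extract(archive))
```

Only extract archives from sources you trust, since tar archives can contain unexpected paths.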

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI            URL         China mirror
arm64-v8a      Download    Download
armeabi-v7a    Download    Download
x86_64         Download    Download
x86            Download    Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-es_ES-miro-high with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-es_ES-miro-high/es_ES-miro-high.onnx";
  config.model.vits.tokens = "vits-piper-es_ES-miro-high/tokens.txt";
  config.model.vits.data_dir = "vits-piper-es_ES-miro-high/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc   -I /tmp/sherpa-onnx/shared/include   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-c-api   -lonnxruntime   -o /tmp/test-piper   /tmp/test-piper.c

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.
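If you launch the compiled binary from a script rather than from an interactive shell, the same library search path can be passed to the child process. The following is a minimal sketch in Python; the helper name env_with_library_path is an illustrative assumption, and the variable names are the ones shown in the export commands above.

```python
import os


def env_with_library_path(lib_dir, system, base_env=None):
    """Return a copy of base_env with lib_dir prepended to the loader path.

    system is the value of platform.system(): "Linux" or "Darwin" (macOS).
    """
    var = {"Linux": "LD_LIBRARY_PATH", "Darwin": "DYLD_LIBRARY_PATH"}.get(system)
    if var is None:
        raise ValueError(f"unsupported system: {system}")
    env = dict(os.environ if base_env is None else base_env)
    current = env.get(var, "")
    # Prepend so that the freshly built libraries take precedence.
    env[var] = f"{lib_dir}:{current}" if current else lib_dir
    return env


# Usage (assumes /tmp/test-piper was built as described above; not executed here):
#   import platform
#   import subprocess
#   env = env_with_library_path("/tmp/sherpa-onnx/shared/lib", platform.system())
#   subprocess.run(["/tmp/test-piper"], cwd="/tmp", env=env, check=True)
```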

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx and soundfile (used below to write the audio file) via

pip install sherpa-onnx soundfile

and you have downloaded and extracted the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-es_ES-miro-high.tar.bz2

You can use the following code to play with vits-piper-es_ES-miro-high with Python API.

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-es_ES-miro-high/es_ES-miro-high.onnx",
            data_dir="vits-piper-es_ES-miro-high/espeak-ng-data",
            tokens="vits-piper-es_ES-miro-high/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.",
                     sid=0,
                     speed=1.0)

sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
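The example above assumes the extracted model directory is in the current working directory. A minimal sketch of a pre-flight check follows; it reports which of the three paths used in the config are missing before the config is built. The helper name missing_model_files is an illustrative assumption and not part of sherpa-onnx.

```python
from pathlib import Path


def missing_model_files(model_dir, onnx_name):
    """Return the expected model paths that do not exist under model_dir."""
    root = Path(model_dir)
    expected = [root / onnx_name, root / "tokens.txt", root / "espeak-ng-data"]
    return [str(p) for p in expected if not p.exists()]


# Usage before constructing sherpa_onnx.OfflineTtsConfig (not executed here):
#   missing = missing_model_files(
#       "vits-piper-es_ES-miro-high", "es_ES-miro-high.onnx")
#   if missing:
#       raise FileNotFoundError(f"Missing model files: {missing}")
```

Failing early with the list of missing paths is usually easier to debug than a generic config validation error.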

C++ API

You can use the following code to play with vits-piper-es_ES-miro-high with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-es_ES-miro-high/es_ES-miro-high.onnx";
  config.model.vits.tokens = "vits-piper-es_ES-miro-high/tokens.txt";
  config.model.vits.data_dir = "vits-piper-es_ES-miro-high/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++   -std=c++17   -I /tmp/sherpa-onnx/shared/include   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-cxx-api   -lsherpa-onnx-c-api   -lonnxruntime   -o /tmp/test-piper   /tmp/test-piper.cc

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with vits-piper-es_ES-miro-high with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-es_ES-miro-high/es_ES-miro-high.onnx".into()),
                tokens: Some("vits-piper-es_ES-miro-high/tokens.txt".into()),
                data_dir: Some("vits-piper-es_ES-miro-high/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-es_ES-miro-high with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-es_ES-miro-high/es_ES-miro-high.onnx',
        tokens: 'vits-piper-es_ES-miro-high/tokens.txt',
        dataDir: 'vits-piper-es_ES-miro-high/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to play with vits-piper-es_ES-miro-high with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-es_ES-miro-high/es_ES-miro-high.onnx',
    tokens: 'vits-piper-es_ES-miro-high/tokens.txt',
    dataDir: 'vits-piper-es_ES-miro-high/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-es_ES-miro-high with Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-es_ES-miro-high/es_ES-miro-high.onnx",
    lexicon: "",
    tokens: "vits-piper-es_ES-miro-high/tokens.txt",
    dataDir: "vits-piper-es_ES-miro-high/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-es_ES-miro-high with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-es_ES-miro-high/es_ES-miro-high.onnx";
config.Model.Vits.Tokens = "vits-piper-es_ES-miro-high/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-es_ES-miro-high/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-es_ES-miro-high with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-es_ES-miro-high/es_ES-miro-high.onnx",
        tokens = "vits-piper-es_ES-miro-high/tokens.txt",
        dataDir = "vits-piper-es_ES-miro-high/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with vits-piper-es_ES-miro-high with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-es_ES-miro-high/es_ES-miro-high.onnx");
    vits.setTokens("vits-piper-es_ES-miro-high/tokens.txt");
    vits.setDataDir("vits-piper-es_ES-miro-high/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-es_ES-miro-high with Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-es_ES-miro-high/es_ES-miro-high.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-es_ES-miro-high/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-es_ES-miro-high/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-es_ES-miro-high with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-es_ES-miro-high/es_ES-miro-high.onnx",
				Tokens:  "vits-piper-es_ES-miro-high/tokens.txt",
				DataDir: "vits-piper-es_ES-miro-high/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.

sample audios for different speakers are listed below:

Speaker 0

vits-piper-es_ES-sharvard-medium

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/es/es_ES/sharvard/medium

Number of speakers    Sample rate
2                     22050

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-es_ES-sharvard-medium.tar.bz2

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI            URL        China mirror
arm64-v8a      Download   Download
armeabi-v7a    Download   Download
x86_64         Download   Download
x86            Download   Download

If you don’t know what ABI is, you probably need to select arm64-v8a.
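
As a rough aid for choosing a row in the table above, the sketch below maps a CPU architecture name (for example, the output of `uname -m` on the device) to an ABI name. The function name `guess_android_abi` and the mapping table are illustrative assumptions and are not part of sherpa-onnx. Unknown names fall back to arm64-v8a, following the advice above.

```python
import platform

# Assumed mapping from CPU architecture names (as printed by `uname -m`
# or returned by platform.machine()) to the ABI names used in the table.
_MACHINE_TO_ABI = {
    "aarch64": "arm64-v8a",
    "arm64": "arm64-v8a",
    "armv7l": "armeabi-v7a",
    "x86_64": "x86_64",
    "amd64": "x86_64",
    "i686": "x86",
    "i386": "x86",
    "x86": "x86",
}


def guess_android_abi(machine, default="arm64-v8a"):
    """Return the ABI for a CPU architecture name (hypothetical helper).

    Unknown names return `default`, which is arm64-v8a unless overridden.
    """
    return _MACHINE_TO_ABI.get(machine.strip().lower(), default)


if __name__ == "__main__":
    # Prints the guess for the machine running this script.
    print(guess_android_abi(platform.machine()))
```

On a device connected through adb, `adb shell getprop ro.product.cpu.abi` reports the ABI directly, so the mapping is only needed when that is not an option.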

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-es_ES-sharvard-medium with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-es_ES-sharvard-medium/es_ES-sharvard-medium.onnx";
  config.model.vits.tokens = "vits-piper-es_ES-sharvard-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-es_ES-sharvard-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc   -I /tmp/sherpa-onnx/shared/include   -L /tmp/sherpa-onnx/shared/lib   -o /tmp/test-piper   /tmp/test-piper.c   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-es_ES-sharvard-medium.tar.bz2

You can use the following code to play with vits-piper-es_ES-sharvard-medium

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-es_ES-sharvard-medium/es_ES-sharvard-medium.onnx",
            data_dir="vits-piper-es_ES-sharvard-medium/espeak-ng-data",
            tokens="vits-piper-es_ES-sharvard-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
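
If soundfile is not available, the generated samples can also be written with Python's standard `wave` module. The following is a minimal sketch, assuming that `audio.samples` is a sequence of floats in [-1, 1] and `audio.sample_rate` is an integer. The helper name `write_wave` is hypothetical and is not part of the sherpa_onnx package.

```python
import os
import struct
import tempfile
import wave


def write_wave(filename, samples, sample_rate):
    """Write mono float samples as a 16-bit PCM WAV file (hypothetical helper).

    Assumes the samples are floats in [-1, 1]; values outside are clipped.
    """
    frames = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))  # clip to avoid integer overflow
        frames += struct.pack("<h", int(s * 32767))  # little-endian int16
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)  # mono
        f.setsampwidth(2)  # 2 bytes per sample
        f.setframerate(sample_rate)
        f.writeframes(bytes(frames))


if __name__ == "__main__":
    # Demo with a few made-up samples instead of real TTS output.
    demo_path = os.path.join(tempfile.mkdtemp(), "demo.wav")
    write_wave(demo_path, [0.0, 0.25, -0.25, 0.0], 22050)
    print("Saved to", demo_path)
```

With the objects from the example above, the call would be `write_wave("test.wav", audio.samples, audio.sample_rate)`.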

C++ API

You can use the following code to play with vits-piper-es_ES-sharvard-medium with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-es_ES-sharvard-medium/es_ES-sharvard-medium.onnx";
  config.model.vits.tokens = "vits-piper-es_ES-sharvard-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-es_ES-sharvard-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++   -std=c++17   -I /tmp/sherpa-onnx/shared/include   -L /tmp/sherpa-onnx/shared/lib   -o /tmp/test-piper   /tmp/test-piper.cc   -lsherpa-onnx-cxx-api   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html
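
The C++ example above passes a progress callback that returns 1 to continue and 0 to stop. The same idea can be sketched in Python as a small callable object. This is only an illustration: the class name `StopAfter` is hypothetical, and it assumes a binding that invokes the callback as `callback(samples, progress)` and treats a return value of 0 as a request to stop. Please check the API documentation of the binding you use before relying on it.

```python
class StopAfter:
    """Callback that asks generation to stop once enough audio has arrived.

    Hypothetical helper; it assumes the caller passes each newly generated
    chunk as `samples` and a completion fraction in [0, 1] as `progress`.
    """

    def __init__(self, max_samples):
        self.max_samples = max_samples
        self.num_samples = 0
        self.chunks = []
        self.last_progress = 0.0

    def __call__(self, samples, progress):
        self.chunks.append(list(samples))
        self.num_samples += len(samples)
        self.last_progress = progress
        # 1 means continue generating, 0 means stop generating
        return 1 if self.num_samples < self.max_samples else 0


if __name__ == "__main__":
    # Simulate a generator that delivers audio in three chunks.
    cb = StopAfter(max_samples=5)
    for chunk, progress in [([0.1, 0.2, 0.3], 0.3), ([0.4, 0.5, 0.6], 0.6), ([0.7], 1.0)]:
        if cb(chunk, progress) == 0:
            break
    print("Collected", cb.num_samples, "samples")
```

Keeping the state in an object rather than in globals makes it easy to use a separate callback per request.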

Rust API

You can use the following code to play with vits-piper-es_ES-sharvard-medium with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-es_ES-sharvard-medium/es_ES-sharvard-medium.onnx".into()),
                tokens: Some("vits-piper-es_ES-sharvard-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-es_ES-sharvard-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-es_ES-sharvard-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-es_ES-sharvard-medium/es_ES-sharvard-medium.onnx',
        tokens: 'vits-piper-es_ES-sharvard-medium/tokens.txt',
        dataDir: 'vits-piper-es_ES-sharvard-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.
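
The Node.js example above reports a real-time factor (RTF), computed as the elapsed processing time divided by the duration of the generated audio. A value below 1 means synthesis runs faster than playback. The same arithmetic is sketched below in Python; the helper names `real_time_factor` and `timed` are hypothetical and are not part of any sherpa-onnx binding.

```python
import time


def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """Return elapsed_seconds divided by the audio duration in seconds."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate
    return elapsed_seconds / duration


def timed(fn, *args, **kwargs):
    """Call fn(*args, **kwargs) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    # 44100 samples at 22050 Hz is 2 seconds of audio; 1 second of
    # processing therefore gives an RTF of 0.5.
    print(real_time_factor(1.0, 44100, 22050))
```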

Dart API

You can use the following code to play with vits-piper-es_ES-sharvard-medium with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-es_ES-sharvard-medium/es_ES-sharvard-medium.onnx',
    tokens: 'vits-piper-es_ES-sharvard-medium/tokens.txt',
    dataDir: 'vits-piper-es_ES-sharvard-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-es_ES-sharvard-medium with Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-es_ES-sharvard-medium/es_ES-sharvard-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-es_ES-sharvard-medium/tokens.txt",
    dataDir: "vits-piper-es_ES-sharvard-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-es_ES-sharvard-medium with C# API.

using System;
using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-es_ES-sharvard-medium/es_ES-sharvard-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-es_ES-sharvard-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-es_ES-sharvard-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-es_ES-sharvard-medium with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-es_ES-sharvard-medium/es_ES-sharvard-medium.onnx",
        tokens = "vits-piper-es_ES-sharvard-medium/tokens.txt",
        dataDir = "vits-piper-es_ES-sharvard-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with vits-piper-es_ES-sharvard-medium with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-es_ES-sharvard-medium/es_ES-sharvard-medium.onnx");
    vits.setTokens("vits-piper-es_ES-sharvard-medium/tokens.txt");
    vits.setDataDir("vits-piper-es_ES-sharvard-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-es_ES-sharvard-medium with Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-es_ES-sharvard-medium/es_ES-sharvard-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-es_ES-sharvard-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-es_ES-sharvard-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-es_ES-sharvard-medium with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-es_ES-sharvard-medium/es_ES-sharvard-medium.onnx",
				Tokens:  "vits-piper-es_ES-sharvard-medium/tokens.txt",
				DataDir: "vits-piper-es_ES-sharvard-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.

sample audios for different speakers are listed below:

Speaker 0

Speaker 1

vits-piper-es_MX-ald-medium

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/es/es_MX/ald/medium

Number of speakers    Sample rate
1                     22050

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-es_MX-ald-medium.tar.bz2
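
For scripted setups, downloading and unpacking the archive can be sketched with the Python standard library. The helper names below (`model_url`, `extract_archive`, `download_model`) are hypothetical. The sketch also assumes that the archive unpacks into a directory named after the model, which matches the paths used in the examples on this page.

```python
import os
import tarfile
import urllib.request

BASE_URL = "https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models"


def model_url(name):
    """Build the download URL for a model name such as vits-piper-es_MX-ald-medium."""
    return f"{BASE_URL}/{name}.tar.bz2"


def extract_archive(archive_path, dest_dir="."):
    """Unpack a .tar.bz2 archive into dest_dir."""
    with tarfile.open(archive_path, "r:bz2") as tar:
        tar.extractall(dest_dir)


def download_model(name, dest_dir="."):
    """Download, unpack, and delete the archive; return the model directory.

    Assumption: the archive contains a top-level directory called `name`.
    """
    archive_path = os.path.join(dest_dir, f"{name}.tar.bz2")
    urllib.request.urlretrieve(model_url(name), archive_path)
    extract_archive(archive_path, dest_dir)
    os.remove(archive_path)
    return os.path.join(dest_dir, name)
```

Calling `download_model("vits-piper-es_MX-ald-medium")` from the directory where you run the examples should leave the model files where the code on this page expects them, provided the assumption about the archive layout holds.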

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI            URL        China mirror
arm64-v8a      Download   Download
armeabi-v7a    Download   Download
x86_64         Download   Download
x86            Download   Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-es_MX-ald-medium with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-es_MX-ald-medium/es_MX-ald-medium.onnx";
  config.model.vits.tokens = "vits-piper-es_MX-ald-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-es_MX-ald-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc   -I /tmp/sherpa-onnx/shared/include   -L /tmp/sherpa-onnx/shared/lib   -o /tmp/test-piper   /tmp/test-piper.c   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-es_MX-ald-medium.tar.bz2

You can use the following code to play with vits-piper-es_MX-ald-medium

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-es_MX-ald-medium/es_MX-ald-medium.onnx",
            data_dir="vits-piper-es_MX-ald-medium/espeak-ng-data",
            tokens="vits-piper-es_MX-ald-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)

C++ API

You can use the following code to play with vits-piper-es_MX-ald-medium with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-es_MX-ald-medium/es_MX-ald-medium.onnx";
  config.model.vits.tokens = "vits-piper-es_MX-ald-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-es_MX-ald-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
  -DSHERPA_ONNX_ENABLE_C_API=ON \
  -DCMAKE_BUILD_TYPE=Release \
  -DBUILD_SHARED_LIBS=ON \
  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

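# Libraries are listed after the source file so that linkers
# using --as-needed can still resolve the symbols.
g++ \
  -std=c++17 \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-piper \
  /tmp/test-piper.cc \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-cxx-api \
  -lsherpa-onnx-c-api \
  -lonnxruntime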

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html
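
The callback contract used in the C++ example above (return 1 to continue, return 0 to stop) can be illustrated without sherpa-onnx. The following Python sketch is only a simulation: `fake_generate`, its chunking, and `stop_after_half` are hypothetical stand-ins and not part of the sherpa-onnx API. It shows how a generation loop might honor the callback's return value.

```python
from typing import Callable, List


def fake_generate(chunks: List[List[float]],
                  callback: Callable[[List[float], float], int]) -> List[float]:
    """Hypothetical stand-in for a TTS engine that emits audio chunk by chunk.

    After each chunk the callback receives the new samples and the overall
    progress in [0, 1]. Returning 0 stops generation early; returning 1
    lets it continue.
    """
    samples: List[float] = []
    for i, chunk in enumerate(chunks, start=1):
        samples.extend(chunk)
        progress = i / len(chunks)
        if callback(chunk, progress) == 0:
            break  # the caller asked to stop
    return samples


def stop_after_half(chunk: List[float], progress: float) -> int:
    """Print progress and request a stop once half of the chunks are done."""
    print(f"Progress: {progress * 100:.3f}%")
    return 1 if progress < 0.5 else 0


if __name__ == "__main__":
    audio = fake_generate([[0.0] * 4, [0.1] * 4, [0.2] * 4, [0.3] * 4],
                          stop_after_half)
    print("Samples kept:", len(audio))
```

In this simulation the samples produced before the stop request are kept, so an early stop yields a shorter but still usable buffer.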

Rust API

You can use the following code to play with vits-piper-es_MX-ald-medium with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-es_MX-ald-medium/es_MX-ald-medium.onnx".into()),
                tokens: Some("vits-piper-es_MX-ald-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-es_MX-ald-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-es_MX-ald-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-es_MX-ald-medium/es_MX-ald-medium.onnx',
        tokens: 'vits-piper-es_MX-ald-medium/tokens.txt',
        dataDir: 'vits-piper-es_MX-ald-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

const start = Date.now();
const audio = tts.generate({text, generationConfig});
const stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.
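
The real-time factor (RTF) printed by the example above is the elapsed synthesis time divided by the duration of the generated audio, so a value below 1 means synthesis ran faster than playback. A minimal Python sketch of the same arithmetic follows; the helper name `real_time_factor` and the numbers are illustrative assumptions, not measurements.

```python
def real_time_factor(elapsed_seconds: float, num_samples: int,
                     sample_rate: int) -> float:
    """Return elapsed time divided by audio duration.

    The audio duration in seconds is num_samples / sample_rate.
    """
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate
    return elapsed_seconds / duration


if __name__ == "__main__":
    # Illustrative numbers only: 10 seconds of audio at 22050 Hz,
    # synthesized in 2 seconds.
    rtf = real_time_factor(2.0, 220500, 22050)
    print(f"RTF = {rtf:.3f}")
```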

Dart API

You can use the following code to play with vits-piper-es_MX-ald-medium with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-es_MX-ald-medium/es_MX-ald-medium.onnx',
    tokens: 'vits-piper-es_MX-ald-medium/tokens.txt',
    dataDir: 'vits-piper-es_MX-ald-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-es_MX-ald-medium with Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-es_MX-ald-medium/es_MX-ald-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-es_MX-ald-medium/tokens.txt",
    dataDir: "vits-piper-es_MX-ald-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-es_MX-ald-medium with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-es_MX-ald-medium/es_MX-ald-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-es_MX-ald-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-es_MX-ald-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-es_MX-ald-medium with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  val config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-es_MX-ald-medium/es_MX-ald-medium.onnx",
        tokens = "vits-piper-es_MX-ald-medium/tokens.txt",
        dataDir = "vits-piper-es_MX-ald-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with vits-piper-es_MX-ald-medium with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-es_MX-ald-medium/es_MX-ald-medium.onnx");
    vits.setTokens("vits-piper-es_MX-ald-medium/tokens.txt");
    vits.setDataDir("vits-piper-es_MX-ald-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-es_MX-ald-medium with Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-es_MX-ald-medium/es_MX-ald-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-es_MX-ald-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-es_MX-ald-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-es_MX-ald-medium with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-es_MX-ald-medium/es_MX-ald-medium.onnx",
				Tokens:  "vits-piper-es_MX-ald-medium/tokens.txt",
				DataDir: "vits-piper-es_MX-ald-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.

audio samples for each speaker are listed below:

Speaker 0

vits-piper-es_MX-claude-high

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/es/es_MX/claude/high

Number of speakers    Sample rate
1                     22050

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-es_MX-claude-high.tar.bz2
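
The archive can be fetched and unpacked with any tool. As one option, here is a minimal Python sketch that uses only the standard library. The helper names `download` and `extract_tar_bz2` are illustrative, and the sketch assumes the archive unpacks into a `vits-piper-es_MX-claude-high/` directory, as the paths used in the API examples of this section suggest.

```python
import tarfile
import urllib.request
from pathlib import Path

URL = ("https://github.com/k2-fsa/sherpa-onnx/releases/download/"
       "tts-models/vits-piper-es_MX-claude-high.tar.bz2")


def download(url: str, dest: Path) -> Path:
    """Download url to dest unless dest already exists."""
    if not dest.exists():
        urllib.request.urlretrieve(url, str(dest))
    return dest


def extract_tar_bz2(archive: Path, out_dir: Path) -> list:
    """Extract a .tar.bz2 archive into out_dir and return its member names."""
    out_dir.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive, "r:bz2") as tar:
        names = tar.getnames()
        tar.extractall(out_dir)
    return names


if __name__ == "__main__":
    archive = download(URL, Path("vits-piper-es_MX-claude-high.tar.bz2"))
    # Show the first few extracted entries as a quick sanity check.
    for name in extract_tar_bz2(archive, Path("."))[:5]:
        print(name)
```

Only extract archives from sources you trust, since `extractall` writes whatever paths the archive contains.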

Android APK

The following table lists the Android TTS Engine APKs built with this model for sherpa-onnx v1.13.2:

ABI            URL         China mirror
arm64-v8a      Download    Download
armeabi-v7a    Download    Download
x86_64         Download    Download
x86            Download    Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-es_MX-claude-high with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-es_MX-claude-high/es_MX-claude-high.onnx";
  config.model.vits.tokens = "vits-piper-es_MX-claude-high/tokens.txt";
  config.model.vits.data_dir = "vits-piper-es_MX-claude-high/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
  -DSHERPA_ONNX_ENABLE_C_API=ON \
  -DCMAKE_BUILD_TYPE=Release \
  -DBUILD_SHARED_LIBS=ON \
  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

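# Libraries are listed after the source file so that linkers
# using --as-needed can still resolve the symbols.
gcc \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-piper \
  /tmp/test-piper.c \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-c-api \
  -lonnxruntime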

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html
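
The C example above refers to the model through relative paths, so it has to be started from the directory that contains `vits-piper-es_MX-claude-high/`. A small Python sketch like the following can confirm the layout before running the binary. The helper name `missing_model_files` is hypothetical, and the list of required entries simply mirrors the three paths used in the example.

```python
from pathlib import Path
from typing import List, Sequence

MODEL_DIR = "vits-piper-es_MX-claude-high"
# Entries copied from the example above; espeak-ng-data is a directory.
REQUIRED = ("es_MX-claude-high.onnx", "tokens.txt", "espeak-ng-data")


def missing_model_files(base: Path, model_dir: str = MODEL_DIR,
                        required: Sequence[str] = REQUIRED) -> List[str]:
    """Return the required entries that do not exist under base/model_dir."""
    root = base / model_dir
    return [name for name in required if not (root / name).exists()]


if __name__ == "__main__":
    missing = missing_model_files(Path.cwd())
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All model files found")
```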

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-es_MX-claude-high.tar.bz2

You can use the following code to play with vits-piper-es_MX-claude-high

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-es_MX-claude-high/es_MX-claude-high.onnx",
            data_dir="vits-piper-es_MX-claude-high/espeak-ng-data",
            tokens="vits-piper-es_MX-claude-high/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.",
                     sid=0,
                     speed=1.0)

sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
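
The example above relies on the third-party `soundfile` package to save the result. If that package is not available, a WAV file can also be written with Python's standard `wave` module. The sketch below assumes the samples are mono floats in the range [-1, 1]; treat that as an assumption to verify for your setup. `write_wav_16bit` is an illustrative name, not a sherpa-onnx function.

```python
import math
import struct
import wave
from typing import Sequence


def write_wav_16bit(filename: str, samples: Sequence[float],
                    sample_rate: int) -> None:
    """Write mono float samples (assumed in [-1, 1]) as 16-bit PCM WAV."""
    frames = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))  # clip out-of-range values
        frames += struct.pack("<h", int(round(s * 32767)))
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)  # 2 bytes per sample = 16 bits
        f.setframerate(sample_rate)
        f.writeframes(bytes(frames))


if __name__ == "__main__":
    # Stand-in for audio.samples: one second of a 440 Hz tone at 22050 Hz.
    rate = 22050
    tone = [0.3 * math.sin(2 * math.pi * 440 * n / rate) for n in range(rate)]
    write_wav_16bit("tone.wav", tone, rate)
```

With a generated result, the call would look like `write_wav_16bit("test.wav", audio.samples, audio.sample_rate)`, using the same two attributes the `soundfile` call reads.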

C++ API

You can use the following code to play with vits-piper-es_MX-claude-high with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-es_MX-claude-high/es_MX-claude-high.onnx";
  config.model.vits.tokens = "vits-piper-es_MX-claude-high/tokens.txt";
  config.model.vits.data_dir = "vits-piper-es_MX-claude-high/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
  -DSHERPA_ONNX_ENABLE_C_API=ON \
  -DCMAKE_BUILD_TYPE=Release \
  -DBUILD_SHARED_LIBS=ON \
  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

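# Libraries are listed after the source file so that linkers
# using --as-needed can still resolve the symbols.
g++ \
  -std=c++17 \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-piper \
  /tmp/test-piper.cc \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-cxx-api \
  -lsherpa-onnx-c-api \
  -lonnxruntime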

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with vits-piper-es_MX-claude-high with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-es_MX-claude-high/es_MX-claude-high.onnx".into()),
                tokens: Some("vits-piper-es_MX-claude-high/tokens.txt".into()),
                data_dir: Some("vits-piper-es_MX-claude-high/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-es_MX-claude-high with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-es_MX-claude-high/es_MX-claude-high.onnx',
        tokens: 'vits-piper-es_MX-claude-high/tokens.txt',
        dataDir: 'vits-piper-es_MX-claude-high/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

const start = Date.now();
const audio = tts.generate({text, generationConfig});
const stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to play with vits-piper-es_MX-claude-high with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-es_MX-claude-high/es_MX-claude-high.onnx',
    tokens: 'vits-piper-es_MX-claude-high/tokens.txt',
    dataDir: 'vits-piper-es_MX-claude-high/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-es_MX-claude-high with Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-es_MX-claude-high/es_MX-claude-high.onnx",
    lexicon: "",
    tokens: "vits-piper-es_MX-claude-high/tokens.txt",
    dataDir: "vits-piper-es_MX-claude-high/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-es_MX-claude-high with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-es_MX-claude-high/es_MX-claude-high.onnx";
config.Model.Vits.Tokens = "vits-piper-es_MX-claude-high/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-es_MX-claude-high/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-es_MX-claude-high with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  val config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-es_MX-claude-high/es_MX-claude-high.onnx",
        tokens = "vits-piper-es_MX-claude-high/tokens.txt",
        dataDir = "vits-piper-es_MX-claude-high/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with vits-piper-es_MX-claude-high with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-es_MX-claude-high/es_MX-claude-high.onnx");
    vits.setTokens("vits-piper-es_MX-claude-high/tokens.txt");
    vits.setDataDir("vits-piper-es_MX-claude-high/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-es_MX-claude-high with Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-es_MX-claude-high/es_MX-claude-high.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-es_MX-claude-high/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-es_MX-claude-high/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-es_MX-claude-high with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-es_MX-claude-high/es_MX-claude-high.onnx",
				Tokens:  "vits-piper-es_MX-claude-high/tokens.txt",
				DataDir: "vits-piper-es_MX-claude-high/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.

sample audios for different speakers are listed below:

Speaker 0

supertonic-3-es

Info about this model

This model is supertonic 3 from https://huggingface.co/Supertone/supertonic-3

It supports 31 languages: en, ko, ja, ar, bg, cs, da, de, el, es, et, fi, fr, hi, hr, hu, id, it, lt, lv, nl, pl, pt, ro, ru, sk, sl, sv, tr, uk, vi.

This page shows samples for Spanish (es).
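
Several of the API examples on this page (C, C++, Rust, C#, Java, Pascal, Go) pass the language to the model as a JSON string in the `extra` field, for example `{"lang": "es"}`. The following is a minimal sketch of a helper that builds that string and rejects codes outside the list above. `SUPPORTED_LANGS` and `make_extra` are illustrative names, not part of sherpa-onnx.

```python
import json

# Language codes copied from the list above (31 entries).
SUPPORTED_LANGS = (
    "en ko ja ar bg cs da de el es et fi fr hi hr hu id it lt lv "
    "nl pl pt ro ru sk sl sv tr uk vi"
).split()


def make_extra(lang: str) -> str:
    """Return the JSON string for the `extra` field (hypothetical helper)."""
    if lang not in SUPPORTED_LANGS:
        raise ValueError(f"unsupported language code: {lang!r}")
    return json.dumps({"lang": lang})


print(make_extra("es"))  # {"lang": "es"}
```

Validating the code before calling the engine gives a clear error message early instead of relying on how the model handles an unknown language.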

Number of speakers    Sample rate (Hz)
10                    24000

Speaker IDs

sid: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-supertonic-3-tts-int8-2026-05-11.tar.bz2
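
Once the archive has been downloaded, it can be unpacked and checked with Python's standard library. This is a minimal sketch: the file names in `EXPECTED_FILES` are taken from the API examples on this page, the helper names are illustrative, and the assumption that the archive unpacks into a directory named after the archive is inferred from the paths used in those examples.

```python
import tarfile
from pathlib import Path

# File names taken from the API examples on this page.
EXPECTED_FILES = [
    "duration_predictor.int8.onnx",
    "text_encoder.int8.onnx",
    "vector_estimator.int8.onnx",
    "vocoder.int8.onnx",
    "tts.json",
    "unicode_indexer.bin",
    "voice.bin",
]


def extract_archive(archive: str, dest: str = ".") -> None:
    """Unpack a .tar.bz2 archive into dest. Only use with archives you trust."""
    with tarfile.open(archive, "r:bz2") as tar:
        tar.extractall(dest)


def missing_files(model_dir: str) -> list[str]:
    """Return the expected file names that are absent from model_dir."""
    root = Path(model_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]
```

If the layout matches the examples, `missing_files("sherpa-onnx-supertonic-3-tts-int8-2026-05-11")` returns an empty list after extraction.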

Android APK

The following table lists the Android TTS Engine APKs built with this model for sherpa-onnx v1.13.2.

ABI            URL         China mirror
arm64-v8a      Download    Download
armeabi-v7a    Download    Download
x86_64         Download    Download
x86            Download    Download

If you are unsure which ABI your device uses, arm64-v8a is the most likely choice.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

Python API

Assume you have installed sherpa-onnx and soundfile (used below to write the wave file) via

pip install sherpa-onnx soundfile

and you have downloaded and extracted the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-supertonic-3-tts-int8-2026-05-11.tar.bz2

You can use the following code to play with supertonic-3:

import sherpa_onnx
import soundfile as sf

tts_config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        supertonic=sherpa_onnx.OfflineTtsSupertonicModelConfig(
            duration_predictor="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
            text_encoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
            vector_estimator="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
            vocoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
            tts_json="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
            unicode_indexer="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
            voice_style="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
        ),
        debug=False,
        num_threads=2,
        provider="cpu",
    ),
)

tts = sherpa_onnx.OfflineTts(tts_config)

gen_config = sherpa_onnx.GenerationConfig()
gen_config.sid = 0
gen_config.num_steps = 8
gen_config.speed = 1.0
gen_config.extra["lang"] = "es"

audio = tts.generate("Este es un motor de texto a voz que utiliza kaldi de próxima generación.", gen_config)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
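
To sanity-check the result, the header of test.wav can be read with Python's standard `wave` module. This is a minimal sketch, assuming the file was written as integer PCM (the `wave` module does not read floating-point WAV files); `wav_info` is an illustrative helper, not part of sherpa-onnx.

```python
import wave


def wav_info(path: str) -> tuple[int, int, float]:
    """Return (sample_rate, num_channels, duration_in_seconds) of a PCM WAV file."""
    with wave.open(path, "rb") as f:
        rate = f.getframerate()
        return rate, f.getnchannels(), f.getnframes() / rate
```

Calling `wav_info("test.wav")` after running the example lets you compare the reported sample rate with the value listed for this model.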

C API

You can use the following code to play with supertonic-3 with C API.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));
  config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
  config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
  config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
  config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
  config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
  config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
  config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";

  config.model.num_threads = 2;

  // If you don't want to see debug messages, please set it to 0
  config.model.debug = 1;

  const char *text = "Este es un motor de texto a voz que utiliza kaldi de próxima generación.";

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.num_steps = 8;
  gen_cfg.speed = 1.0;
  gen_cfg.extra = "{\"lang\": \"es\"}";

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg,
                                             ProgressCallback, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

To build sherpa-onnx as a shared library with the C API enabled, run:

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
 -DSHERPA_ONNX_ENABLE_C_API=ON \
 -DCMAKE_BUILD_TYPE=Release \
 -DBUILD_SHARED_LIBS=ON \
 -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
 ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-supertonic.c. Then you can compile it with the following command:

gcc \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-supertonic \
  /tmp/test-supertonic.c \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-supertonic.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html
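
The progress callback in the C example returns 1 to continue generating and 0 to stop. The following Python sketch shows a callback object that follows the same convention and asks for a stop once a sample budget is used up. The class name `StopAfter` and the `(samples, progress)` signature are assumptions made for illustration; they are not taken from the sherpa-onnx API.

```python
class StopAfter:
    """Callback that requests a stop once max_samples have been received."""

    def __init__(self, max_samples: int) -> None:
        self.max_samples = max_samples
        self.received = 0

    def __call__(self, samples, progress: float) -> int:
        # Count the samples delivered in this chunk.
        self.received += len(samples)
        # 1 means continue, 0 means stop (the convention used in the examples).
        return 1 if self.received < self.max_samples else 0


# Simulate two chunks of 60 samples with a budget of 100 samples.
cb = StopAfter(max_samples=100)
print(cb([0.0] * 60, 0.3))  # 1
print(cb([0.0] * 60, 0.6))  # 0
```

Keeping the counter on the object avoids global state, which matters when several generations run with separate callbacks.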

C++ API

You can use the following code to play with supertonic-3 with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;

  config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
  config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
  config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
  config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
  config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
  config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
  config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";

  config.model.num_threads = 2;

  // If you don't want to see debug messages, please set it to 0
  config.model.debug = 1;

  std::string filename = "./test.wav";
  std::string text = "Este es un motor de texto a voz que utiliza kaldi de próxima generación.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.num_steps = 8;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed
  gen_cfg.extra = R"({"lang": "es"})";

  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

To build sherpa-onnx as a shared library, run:

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
 -DSHERPA_ONNX_ENABLE_C_API=ON \
 -DCMAKE_BUILD_TYPE=Release \
 -DBUILD_SHARED_LIBS=ON \
 -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
 ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-supertonic.cc. Then you can compile it with the following command:

g++ \
  -std=c++17 \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-supertonic \
  /tmp/test-supertonic.cc \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-cxx-api \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-supertonic.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with supertonic-3 with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsSupertonicModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            supertonic: OfflineTtsSupertonicModelConfig {
                duration_predictor: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx".into()),
                text_encoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx".into()),
                vector_estimator: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx".into()),
                vocoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx".into()),
                tts_json: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json".into()),
                unicode_indexer: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin".into()),
                voice_style: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Este es un motor de texto a voz que utiliza kaldi de próxima generación.";

    let gen_config = GenerationConfig {
        sid: 0,
        num_steps: 8,
        speed: 1.0,
        extra: Some(r#"{"lang": "es"}"#.into()),
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with supertonic-3 with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      supertonic: {
        durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
        textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
        vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
        vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
        ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
        unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
        voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
      },
      debug: true,
      numThreads: 2,
      provider: 'cpu',
    },
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Este es un motor de texto a voz que utiliza kaldi de próxima generación.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  numSteps: 8,
  speed: 1.0,
  extra: {lang: 'es'},
});

const start = Date.now();
const audio = tts.generate({text, generationConfig});
const stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.
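
The Node.js example above reports a real-time factor (RTF), which is the time spent generating divided by the duration of the generated audio. The same arithmetic is sketched below in Python; `real_time_factor` is an illustrative helper and the numbers in the usage line are made up.

```python
def real_time_factor(elapsed_seconds: float, num_samples: int, sample_rate: int) -> float:
    """RTF = processing time / audio duration. Below 1 means faster than real time."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # audio length in seconds
    return elapsed_seconds / duration


# Made-up numbers: 0.5 s to synthesize 48000 samples at 24000 Hz (2 s of audio).
print(real_time_factor(0.5, 48000, 24000))  # 0.25
```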

Dart API

You can use the following code to play with supertonic-3 with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final supertonic = sherpa_onnx.OfflineTtsSupertonicModelConfig(
    durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
    textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
    vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
    vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
    ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
    unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
    voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    supertonic: supertonic,
    numThreads: 2,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    numSteps: 8,
    speed: 1.0,
    extra: {'lang': 'es'},
  );
  final audio = tts.generateWithConfig(text: 'Este es un motor de texto a voz que utiliza kaldi de próxima generación.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with supertonic-3 with Swift API.

func run() {
  let supertonic = sherpaOnnxOfflineTtsSupertonicModelConfig(
    durationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
    textEncoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
    vectorEstimator: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
    vocoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
    ttsJson: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
    unicodeIndexer: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
    voiceStyle: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(supertonic: supertonic)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Este es un motor de texto a voz que utiliza kaldi de próxima generación."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.numSteps = 8
  genConfig.speed = 1.0
  genConfig.extra = ["lang": "es"]

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with supertonic-3 with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Supertonic.DurationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.Model.Supertonic.TextEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.Model.Supertonic.VectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.Model.Supertonic.Vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.Model.Supertonic.TtsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.Model.Supertonic.UnicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.Model.Supertonic.VoiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";

config.Model.NumThreads = 2;
config.Model.Debug = 1;
config.Model.Provider = "cpu";

var tts = new OfflineTts(config);
var text = "Este es un motor de texto a voz que utiliza kaldi de próxima generación.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.NumSteps = 8;
genConfig.Speed = 1.0f;
genConfig.Extra = "{\"lang\": \"es\"}";

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with supertonic-3 with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  val config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      supertonic = OfflineTtsSupertonicModelConfig(
        durationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
        textEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
        vectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
        vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
        ttsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
        unicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
        voiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
      ),
      numThreads = 2,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    numSteps = 8,
    speed = 1.0f,
    extra = mapOf("lang" to "es"),
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Este es un motor de texto a voz que utiliza kaldi de próxima generación.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with supertonic-3 with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var supertonic = new OfflineTtsSupertonicModelConfig();
    supertonic.setDurationPredictor("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx");
    supertonic.setTextEncoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx");
    supertonic.setVectorEstimator("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx");
    supertonic.setVocoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx");
    supertonic.setTtsJson("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json");
    supertonic.setUnicodeIndexer("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin");
    supertonic.setVoiceStyle("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setSupertonic(supertonic);
    modelConfig.setNumThreads(2);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);

    var tts = new OfflineTts(config);
    var text = "Este es un motor de texto a voz que utiliza kaldi de próxima generación.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setNumSteps(8);
    genConfig.setSpeed(1.0f);
    genConfig.setExtra("{\"lang\": \"es\"}");

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with supertonic-3 with Pascal API.

program test_supertonic;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Supertonic.DurationPredictor := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx';
  Config.Model.Supertonic.TextEncoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx';
  Config.Model.Supertonic.VectorEstimator := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx';
  Config.Model.Supertonic.Vocoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx';
  Config.Model.Supertonic.TtsJson := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json';
  Config.Model.Supertonic.UnicodeIndexer := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin';
  Config.Model.Supertonic.VoiceStyle := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin';

  Config.Model.NumThreads := 2;
  Config.Model.Debug := True;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.NumSteps := 8;
  GenConfig.Speed := 1.0;
  GenConfig.Extra := '{"lang": "es"}';

  Audio := Tts.GenerateWithConfig('Este es un motor de texto a voz que utiliza kaldi de próxima generación.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with supertonic-3 with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Supertonic: sherpa.OfflineTtsSupertonicModelConfig{
				DurationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
				TextEncoder:       "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
				VectorEstimator:   "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
				Vocoder:           "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
				TtsJson:           "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
				UnicodeIndexer:    "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
				VoiceStyle:        "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
			},
			NumThreads: 2,
			Debug:      true,
		},
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Este es un motor de texto a voz que utiliza kaldi de próxima generación."

	genConfig := sherpa.GenerationConfig{
		Sid:      0,
		NumSteps: 8,
		Speed:    1.0,
		Extra:    `{"lang": "es"}`,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

Sample audios for different speakers are listed below:

Speaker 0

0   Hola mundo.
1   ¿Cómo estás hoy?
2   El cielo es azul.
3   Me encanta el aprendizaje automático.
4   Python es increíble.
5   Buenos días a todos.
6   La inteligencia artificial está creciendo.
7   La síntesis de voz es fascinante.
8   Las redes neuronales son poderosas.
9   El texto a voz convierte texto en audio.
10  El veloz marrón salta sobre el perro perezoso.
11  El aprendizaje automático permite a las computadoras aprender.
12  El procesamiento del lenguaje natural ayuda a las máquinas.
13  El aprendizaje profundo ha revolucionado la inteligencia artificial.
14  La tecnología de síntesis de voz ha avanzado significativamente.
15  La clonación de voz neuronal puede replicar estilos de habla.
16  La normalización de texto es importante para la pronunciación.
17  Los asistentes de voz nos ayudan a interactuar con la tecnología.
18  Los sistemas TTS modernos utilizan aprendizaje profundo.
19  La interacción humano computadora se ha vuelto más intuitiva.

Speaker 1

0   Hola mundo.
1   ¿Cómo estás hoy?
2   El cielo es azul.
3   Me encanta el aprendizaje automático.
4   Python es increíble.
5   Buenos días a todos.
6   La inteligencia artificial está creciendo.
7   La síntesis de voz es fascinante.
8   Las redes neuronales son poderosas.
9   El texto a voz convierte texto en audio.
10  El veloz marrón salta sobre el perro perezoso.
11  El aprendizaje automático permite a las computadoras aprender.
12  El procesamiento del lenguaje natural ayuda a las máquinas.
13  El aprendizaje profundo ha revolucionado la inteligencia artificial.
14  La tecnología de síntesis de voz ha avanzado significativamente.
15  La clonación de voz neuronal puede replicar estilos de habla.
16  La normalización de texto es importante para la pronunciación.
17  Los asistentes de voz nos ayudan a interactuar con la tecnología.
18  Los sistemas TTS modernos utilizan aprendizaje profundo.
19  La interacción humano computadora se ha vuelto más intuitiva.

Speaker 2

0   Hola mundo.
1   ¿Cómo estás hoy?
2   El cielo es azul.
3   Me encanta el aprendizaje automático.
4   Python es increíble.
5   Buenos días a todos.
6   La inteligencia artificial está creciendo.
7   La síntesis de voz es fascinante.
8   Las redes neuronales son poderosas.
9   El texto a voz convierte texto en audio.
10  El veloz marrón salta sobre el perro perezoso.
11  El aprendizaje automático permite a las computadoras aprender.
12  El procesamiento del lenguaje natural ayuda a las máquinas.
13  El aprendizaje profundo ha revolucionado la inteligencia artificial.
14  La tecnología de síntesis de voz ha avanzado significativamente.
15  La clonación de voz neuronal puede replicar estilos de habla.
16  La normalización de texto es importante para la pronunciación.
17  Los asistentes de voz nos ayudan a interactuar con la tecnología.
18  Los sistemas TTS modernos utilizan aprendizaje profundo.
19  La interacción humano computadora se ha vuelto más intuitiva.

Speaker 3

0   Hola mundo.
1   ¿Cómo estás hoy?
2   El cielo es azul.
3   Me encanta el aprendizaje automático.
4   Python es increíble.
5   Buenos días a todos.
6   La inteligencia artificial está creciendo.
7   La síntesis de voz es fascinante.
8   Las redes neuronales son poderosas.
9   El texto a voz convierte texto en audio.
10  El veloz marrón salta sobre el perro perezoso.
11  El aprendizaje automático permite a las computadoras aprender.
12  El procesamiento del lenguaje natural ayuda a las máquinas.
13  El aprendizaje profundo ha revolucionado la inteligencia artificial.
14  La tecnología de síntesis de voz ha avanzado significativamente.
15  La clonación de voz neuronal puede replicar estilos de habla.
16  La normalización de texto es importante para la pronunciación.
17  Los asistentes de voz nos ayudan a interactuar con la tecnología.
18  Los sistemas TTS modernos utilizan aprendizaje profundo.
19  La interacción humano computadora se ha vuelto más intuitiva.

Speaker 4

0

Hola mundo.

1

¿Cómo estás hoy?

2

El cielo es azul.

3

Me encanta el aprendizaje automático.

4

Python es increíble.

5

Buenos días a todos.

6

La inteligencia artificial está creciendo.

7

La síntesis de voz es fascinante.

8

Las redes neuronales son poderosas.

9

El texto a voz convierte texto en audio.

10

El veloz marrón salta sobre el perro perezoso.

11

El aprendizaje automático permite a las computadoras aprender.

12

El procesamiento del lenguaje natural ayuda a las máquinas.

13

El aprendizaje profundo ha revolucionado la inteligencia artificial.

14

La tecnología de síntesis de voz ha avanzado significativamente.

15

La clonación de voz neuronal puede replicar estilos de habla.

16

La normalización de texto es importante para la pronunciación.

17

Los asistentes de voz nos ayudan a interactuar con la tecnología.

18

Los sistemas TTS modernos utilizan aprendizaje profundo.

19

La interacción humano computadora se ha vuelto más intuitiva.

Speaker 5

0

Hola mundo.

1

¿Cómo estás hoy?

2

El cielo es azul.

3

Me encanta el aprendizaje automático.

4

Python es increíble.

5

Buenos días a todos.

6

La inteligencia artificial está creciendo.

7

La síntesis de voz es fascinante.

8

Las redes neuronales son poderosas.

9

El texto a voz convierte texto en audio.

10

El veloz marrón salta sobre el perro perezoso.

11

El aprendizaje automático permite a las computadoras aprender.

12

El procesamiento del lenguaje natural ayuda a las máquinas.

13

El aprendizaje profundo ha revolucionado la inteligencia artificial.

14

La tecnología de síntesis de voz ha avanzado significativamente.

15

La clonación de voz neuronal puede replicar estilos de habla.

16

La normalización de texto es importante para la pronunciación.

17

Los asistentes de voz nos ayudan a interactuar con la tecnología.

18

Los sistemas TTS modernos utilizan aprendizaje profundo.

19

La interacción humano computadora se ha vuelto más intuitiva.

Speaker 6

0

Hola mundo.

1

¿Cómo estás hoy?

2

El cielo es azul.

3

Me encanta el aprendizaje automático.

4

Python es increíble.

5

Buenos días a todos.

6

La inteligencia artificial está creciendo.

7

La síntesis de voz es fascinante.

8

Las redes neuronales son poderosas.

9

El texto a voz convierte texto en audio.

10

El veloz marrón salta sobre el perro perezoso.

11

El aprendizaje automático permite a las computadoras aprender.

12

El procesamiento del lenguaje natural ayuda a las máquinas.

13

El aprendizaje profundo ha revolucionado la inteligencia artificial.

14

La tecnología de síntesis de voz ha avanzado significativamente.

15

La clonación de voz neuronal puede replicar estilos de habla.

16

La normalización de texto es importante para la pronunciación.

17

Los asistentes de voz nos ayudan a interactuar con la tecnología.

18

Los sistemas TTS modernos utilizan aprendizaje profundo.

19

La interacción humano computadora se ha vuelto más intuitiva.

Speaker 7

0

Hola mundo.

1

¿Cómo estás hoy?

2

El cielo es azul.

3

Me encanta el aprendizaje automático.

4

Python es increíble.

5

Buenos días a todos.

6

La inteligencia artificial está creciendo.

7

La síntesis de voz es fascinante.

8

Las redes neuronales son poderosas.

9

El texto a voz convierte texto en audio.

10

El veloz marrón salta sobre el perro perezoso.

11

El aprendizaje automático permite a las computadoras aprender.

12

El procesamiento del lenguaje natural ayuda a las máquinas.

13

El aprendizaje profundo ha revolucionado la inteligencia artificial.

14

La tecnología de síntesis de voz ha avanzado significativamente.

15

La clonación de voz neuronal puede replicar estilos de habla.

16

La normalización de texto es importante para la pronunciación.

17

Los asistentes de voz nos ayudan a interactuar con la tecnología.

18

Los sistemas TTS modernos utilizan aprendizaje profundo.

19

La interacción humano computadora se ha vuelto más intuitiva.

Speaker 8

0

Hola mundo.

1

¿Cómo estás hoy?

2

El cielo es azul.

3

Me encanta el aprendizaje automático.

4

Python es increíble.

5

Buenos días a todos.

6

La inteligencia artificial está creciendo.

7

La síntesis de voz es fascinante.

8

Las redes neuronales son poderosas.

9

El texto a voz convierte texto en audio.

10

El veloz marrón salta sobre el perro perezoso.

11

El aprendizaje automático permite a las computadoras aprender.

12

El procesamiento del lenguaje natural ayuda a las máquinas.

13

El aprendizaje profundo ha revolucionado la inteligencia artificial.

14

La tecnología de síntesis de voz ha avanzado significativamente.

15

La clonación de voz neuronal puede replicar estilos de habla.

16

La normalización de texto es importante para la pronunciación.

17

Los asistentes de voz nos ayudan a interactuar con la tecnología.

18

Los sistemas TTS modernos utilizan aprendizaje profundo.

19

La interacción humano computadora se ha vuelto más intuitiva.

Speaker 9

0

Hola mundo.

1

¿Cómo estás hoy?

2

El cielo es azul.

3

Me encanta el aprendizaje automático.

4

Python es increíble.

5

Buenos días a todos.

6

La inteligencia artificial está creciendo.

7

La síntesis de voz es fascinante.

8

Las redes neuronales son poderosas.

9

El texto a voz convierte texto en audio.

10

El veloz marrón salta sobre el perro perezoso.

11

El aprendizaje automático permite a las computadoras aprender.

12

El procesamiento del lenguaje natural ayuda a las máquinas.

13

El aprendizaje profundo ha revolucionado la inteligencia artificial.

14

La tecnología de síntesis de voz ha avanzado significativamente.

15

La clonación de voz neuronal puede replicar estilos de habla.

16

La normalización de texto es importante para la pronunciación.

17

Los asistentes de voz nos ayudan a interactuar con la tecnología.

18

Los sistemas TTS modernos utilizan aprendizaje profundo.

19

La interacción humano computadora se ha vuelto más intuitiva.

Swahili

This section lists text-to-speech models for Swahili.

vits-piper-sw_CD-lanfrica-medium

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/sw/sw_CD/lanfrica/medium

Number of speakers | Sample rate (Hz)
1 | 22050

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-sw_CD-lanfrica-medium.tar.bz2
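
As a minimal sketch of the download-and-extract step, the snippet below uses only the Python standard library. The helper names `download_model` and `extract_model` are hypothetical and not part of sherpa-onnx. The archive is assumed to unpack into a directory named after the model, as the paths in the examples below suggest.

```python
import tarfile
import urllib.request
from pathlib import Path

MODEL_URL = (
    "https://github.com/k2-fsa/sherpa-onnx/releases/download/"
    "tts-models/vits-piper-sw_CD-lanfrica-medium.tar.bz2"
)


def download_model(url: str = MODEL_URL, dest: str = ".") -> Path:
    """Download the archive into dest and return its local path."""
    target = Path(dest) / url.rsplit("/", 1)[-1]
    urllib.request.urlretrieve(url, str(target))
    return target


def extract_model(archive: Path, dest: str = ".") -> list:
    """Extract a .tar.bz2 archive and return its sorted top-level entry names."""
    with tarfile.open(archive, "r:bz2") as tar:
        names = tar.getnames()
        # Use the safer "data" extraction filter where this Python provides it.
        if hasattr(tarfile, "data_filter"):
            tar.extractall(dest, filter="data")
        else:
            tar.extractall(dest)
    return sorted({name.split("/", 1)[0] for name in names})
```

Calling `extract_model(download_model())` would fetch the archive into the current directory and unpack it there.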

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI | URL | China mirror
arm64-v8a | Download | Download
armeabi-v7a | Download | Download
x86_64 | Download | Download
x86 | Download | Download

If you don’t know what ABI is, you probably need to select arm64-v8a.
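
If you prefer to derive the ABI from a machine architecture string (for example the output of `uname -m`), the following sketch maps common values to the ABI names in the table. The function name `android_abi` and the mapping table are illustrative assumptions, not part of sherpa-onnx. Unknown values fall back to arm64-v8a, in line with the advice above.

```python
import platform

# Common machine strings mapped to the ABI names used in the table above.
_MACHINE_TO_ABI = {
    "aarch64": "arm64-v8a",
    "arm64": "arm64-v8a",
    "armv7l": "armeabi-v7a",
    "armv7": "armeabi-v7a",
    "x86_64": "x86_64",
    "amd64": "x86_64",
    "i686": "x86",
    "i386": "x86",
    "x86": "x86",
}


def android_abi(machine: str) -> str:
    """Return the Android ABI for a machine string, defaulting to arm64-v8a."""
    return _MACHINE_TO_ABI.get(machine.strip().lower(), "arm64-v8a")


if __name__ == "__main__":
    # On a desktop this reports the host architecture, not a phone's.
    print(android_abi(platform.machine()))
```

On a connected device, `adb shell getprop ro.product.cpu.abi` prints the ABI directly, which avoids the mapping altogether.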

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-sw_CD-lanfrica-medium with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-sw_CD-lanfrica-medium/sw_CD-lanfrica-medium.onnx";
  config.model.vits.tokens = "vits-piper-sw_CD-lanfrica-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-sw_CD-lanfrica-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Mtu mmoja hawezi kuiba mazingira.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc   -I /tmp/sherpa-onnx/shared/include   -o /tmp/test-piper   /tmp/test-piper.c   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-sw_CD-lanfrica-medium.tar.bz2

You can use the following code to play with vits-piper-sw_CD-lanfrica-medium

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-sw_CD-lanfrica-medium/sw_CD-lanfrica-medium.onnx",
            data_dir="vits-piper-sw_CD-lanfrica-medium/espeak-ng-data",
            tokens="vits-piper-sw_CD-lanfrica-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Mtu mmoja hawezi kuiba mazingira.",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
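
Before building the config, it can help to confirm that the extracted files are where the example expects them. The sketch below is an assumption-laden helper (the name `missing_model_files` is hypothetical). It checks only for the three paths used in the config above.

```python
from pathlib import Path


def missing_model_files(model_dir: str, model_name: str) -> list:
    """Return the expected paths under model_dir that do not exist."""
    root = Path(model_dir)
    expected = [
        root / model_name,        # the ONNX acoustic model
        root / "tokens.txt",      # token table
        root / "espeak-ng-data",  # phonemizer data directory
    ]
    return [str(p) for p in expected if not p.exists()]
```

For this model the call would be `missing_model_files("vits-piper-sw_CD-lanfrica-medium", "sw_CD-lanfrica-medium.onnx")`. An empty list means all three paths were found.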

C++ API

You can use the following code to play with vits-piper-sw_CD-lanfrica-medium with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-sw_CD-lanfrica-medium/sw_CD-lanfrica-medium.onnx";
  config.model.vits.tokens = "vits-piper-sw_CD-lanfrica-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-sw_CD-lanfrica-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Mtu mmoja hawezi kuiba mazingira.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++   -std=c++17   -I /tmp/sherpa-onnx/shared/include   -o /tmp/test-piper   /tmp/test-piper.cc   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-cxx-api   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html
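
The progress callback in the C++ example returns 1 to continue and 0 to stop. The sketch below expresses the same contract as a plain Python callable that asks for generation to stop once a sample budget is reached. It is only an illustration: the class name `SampleBudget` is hypothetical, and whether a given language binding accepts a callback with this exact signature should be checked against that binding's documentation.

```python
class SampleBudget:
    """Callable that asks the generator to stop after max_samples samples."""

    def __init__(self, max_samples: int):
        self.max_samples = max_samples
        self.received = 0

    def __call__(self, samples, progress: float) -> int:
        # Count the samples delivered in this chunk.
        self.received += len(samples)
        # 1 means continue generating, 0 means stop.
        return 1 if self.received < self.max_samples else 0
```

Keeping the counter on the object avoids global state, which matters if several generations run with separate callbacks.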

Rust API

You can use the following code to play with vits-piper-sw_CD-lanfrica-medium with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-sw_CD-lanfrica-medium/sw_CD-lanfrica-medium.onnx".into()),
                tokens: Some("vits-piper-sw_CD-lanfrica-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-sw_CD-lanfrica-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Mtu mmoja hawezi kuiba mazingira.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-sw_CD-lanfrica-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-sw_CD-lanfrica-medium/sw_CD-lanfrica-medium.onnx',
        tokens: 'vits-piper-sw_CD-lanfrica-medium/tokens.txt',
        dataDir: 'vits-piper-sw_CD-lanfrica-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Mtu mmoja hawezi kuiba mazingira.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.
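
The Node.js example reports a real-time factor (RTF): the elapsed processing time divided by the duration of the generated audio, where the duration is the number of samples divided by the sample rate. A minimal Python sketch of the same arithmetic follows. The function name `real_time_factor` is chosen here for illustration.

```python
def real_time_factor(elapsed_seconds: float, num_samples: int, sample_rate: int) -> float:
    """RTF = processing time / audio duration; below 1 means faster than real time."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # audio length in seconds
    return elapsed_seconds / duration
```

For example, at this model's 22050 Hz sample rate, 44100 samples correspond to 2 seconds of audio, so generating them in 1 second gives an RTF of 0.5.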

Dart API

You can use the following code to play with vits-piper-sw_CD-lanfrica-medium with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-sw_CD-lanfrica-medium/sw_CD-lanfrica-medium.onnx',
    tokens: 'vits-piper-sw_CD-lanfrica-medium/tokens.txt',
    dataDir: 'vits-piper-sw_CD-lanfrica-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Mtu mmoja hawezi kuiba mazingira.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-sw_CD-lanfrica-medium with Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-sw_CD-lanfrica-medium/sw_CD-lanfrica-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-sw_CD-lanfrica-medium/tokens.txt",
    dataDir: "vits-piper-sw_CD-lanfrica-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Mtu mmoja hawezi kuiba mazingira."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-sw_CD-lanfrica-medium with C# API.

using System;
using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-sw_CD-lanfrica-medium/sw_CD-lanfrica-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-sw_CD-lanfrica-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-sw_CD-lanfrica-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Mtu mmoja hawezi kuiba mazingira.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-sw_CD-lanfrica-medium with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-sw_CD-lanfrica-medium/sw_CD-lanfrica-medium.onnx",
        tokens = "vits-piper-sw_CD-lanfrica-medium/tokens.txt",
        dataDir = "vits-piper-sw_CD-lanfrica-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Mtu mmoja hawezi kuiba mazingira.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with vits-piper-sw_CD-lanfrica-medium with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-sw_CD-lanfrica-medium/sw_CD-lanfrica-medium.onnx");
    vits.setTokens("vits-piper-sw_CD-lanfrica-medium/tokens.txt");
    vits.setDataDir("vits-piper-sw_CD-lanfrica-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Mtu mmoja hawezi kuiba mazingira.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-sw_CD-lanfrica-medium with Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-sw_CD-lanfrica-medium/sw_CD-lanfrica-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-sw_CD-lanfrica-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-sw_CD-lanfrica-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Mtu mmoja hawezi kuiba mazingira.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-sw_CD-lanfrica-medium with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-sw_CD-lanfrica-medium/sw_CD-lanfrica-medium.onnx",
				Tokens:  "vits-piper-sw_CD-lanfrica-medium/tokens.txt",
				DataDir: "vits-piper-sw_CD-lanfrica-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Mtu mmoja hawezi kuiba mazingira."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Mtu mmoja hawezi kuiba mazingira.

audio samples for each speaker are listed below:

Speaker 0

Swedish

This section lists text-to-speech models for Swedish.

vits-piper-sv_SE-alma-medium

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/sv/sv_SE/alma/medium

Number of speakers | Sample rate (Hz)
1 | 22050

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-sv_SE-alma-medium.tar.bz2

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI | URL | China mirror
arm64-v8a | Download | Download
armeabi-v7a | Download | Download
x86_64 | Download | Download
x86 | Download | Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-sv_SE-alma-medium with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-sv_SE-alma-medium/sv_SE-alma-medium.onnx";
  config.model.vits.tokens = "vits-piper-sv_SE-alma-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-sv_SE-alma-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Liten skog, med många träd";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc   -I /tmp/sherpa-onnx/shared/include   -o /tmp/test-piper   /tmp/test-piper.c   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html
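
The Swedish sample text contains non-ASCII characters (å, ä). The C API receives text as a `const char *`, so the bytes that reach it depend on how the source file or input is encoded. The Python sketch below assumes the library expects UTF-8 input (an assumption; this section does not state the encoding) and shows how character count and byte count differ after NFC normalization. The helper name `to_utf8_nfc` is hypothetical.

```python
import unicodedata


def to_utf8_nfc(text: str) -> bytes:
    """Normalize text to NFC and encode it as UTF-8 bytes."""
    return unicodedata.normalize("NFC", text).encode("utf-8")


text = "Liten skog, med m\u00e5nga tr\u00e4d"  # "Liten skog, med många träd"
encoded = to_utf8_nfc(text)

# The two accented letters take two bytes each in UTF-8,
# so the byte count exceeds the character count by two.
print(len(text), len(encoded))
```

Saving the C source file as UTF-8 has the same effect for a string literal, provided the compiler does not re-encode it.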

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-sv_SE-alma-medium.tar.bz2

You can use the following code to play with vits-piper-sv_SE-alma-medium

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-sv_SE-alma-medium/sv_SE-alma-medium.onnx",
            data_dir="vits-piper-sv_SE-alma-medium/espeak-ng-data",
            tokens="vits-piper-sv_SE-alma-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Liten skog, med många träd",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)

C++ API

You can use the following code to play with vits-piper-sv_SE-alma-medium with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-sv_SE-alma-medium/sv_SE-alma-medium.onnx";
  config.model.vits.tokens = "vits-piper-sv_SE-alma-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-sv_SE-alma-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Liten skog, med många träd";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++   -std=c++17   -I /tmp/sherpa-onnx/shared/include   -o /tmp/test-piper   /tmp/test-piper.cc   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-cxx-api   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.
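The same library-path setup can also be prepared from a script. The following is a minimal Python sketch, not part of sherpa-onnx: the helper name `env_with_library_path` is made up for illustration, and the only thing it takes from this page is the install prefix /tmp/sherpa-onnx/shared used above.

```python
import os
import platform


def env_with_library_path(lib_dir, base_env=None, system=None):
    """Return a copy of the environment with lib_dir prepended to the
    dynamic-loader search path (hypothetical helper, for illustration)."""
    env = dict(os.environ if base_env is None else base_env)
    system = platform.system() if system is None else system
    # Linux reads LD_LIBRARY_PATH; macOS reads DYLD_LIBRARY_PATH.
    var = "DYLD_LIBRARY_PATH" if system == "Darwin" else "LD_LIBRARY_PATH"
    old = env.get(var, "")
    # Both variables are colon-separated lists of directories.
    env[var] = lib_dir + (":" + old if old else "")
    return env


# Possible usage (assumes /tmp/test-piper was built as described above):
#   import subprocess
#   subprocess.run(["/tmp/test-piper"], cwd="/tmp", check=True,
#                  env=env_with_library_path("/tmp/sherpa-onnx/shared/lib"))
```

Passing the environment to the child process only affects that one run, so your shell settings stay untouched.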

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to try vits-piper-sv_SE-alma-medium with the Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-sv_SE-alma-medium/sv_SE-alma-medium.onnx".into()),
                tokens: Some("vits-piper-sv_SE-alma-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-sv_SE-alma-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Liten skog, med många träd";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-sv_SE-alma-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-sv_SE-alma-medium/sv_SE-alma-medium.onnx',
        tokens: 'vits-piper-sv_SE-alma-medium/tokens.txt',
        dataDir: 'vits-piper-sv_SE-alma-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Liten skog, med många träd';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.
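The Node.js example above reports a real-time factor (RTF): the time spent generating divided by the duration of the generated audio, so values below 1 mean faster than real time. As a minimal sketch of the same arithmetic in Python (the function name is an illustration, not a sherpa-onnx API):

```python
def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """RTF = processing time / audio duration (hypothetical helper)."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration_seconds = num_samples / sample_rate
    return elapsed_seconds / duration_seconds


# Example: 44100 samples at 22050 Hz is 2 seconds of audio.
# Generating it in 0.5 seconds gives RTF = 0.5 / 2.0 = 0.25.
rtf = real_time_factor(0.5, 44100, 22050)
print(f"RTF = {rtf:.3f}")
```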

Dart API

You can use the following code to try vits-piper-sv_SE-alma-medium with the Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-sv_SE-alma-medium/sv_SE-alma-medium.onnx',
    tokens: 'vits-piper-sv_SE-alma-medium/tokens.txt',
    dataDir: 'vits-piper-sv_SE-alma-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Liten skog, med många träd', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to try vits-piper-sv_SE-alma-medium with the Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-sv_SE-alma-medium/sv_SE-alma-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-sv_SE-alma-medium/tokens.txt",
    dataDir: "vits-piper-sv_SE-alma-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Liten skog, med många träd"
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to try vits-piper-sv_SE-alma-medium with the C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-sv_SE-alma-medium/sv_SE-alma-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-sv_SE-alma-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-sv_SE-alma-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Liten skog, med många träd";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to try vits-piper-sv_SE-alma-medium with the Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-sv_SE-alma-medium/sv_SE-alma-medium.onnx",
        tokens = "vits-piper-sv_SE-alma-medium/tokens.txt",
        dataDir = "vits-piper-sv_SE-alma-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Liten skog, med många träd",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to try vits-piper-sv_SE-alma-medium with the Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-sv_SE-alma-medium/sv_SE-alma-medium.onnx");
    vits.setTokens("vits-piper-sv_SE-alma-medium/tokens.txt");
    vits.setDataDir("vits-piper-sv_SE-alma-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Liten skog, med många träd";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to try vits-piper-sv_SE-alma-medium with the Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-sv_SE-alma-medium/sv_SE-alma-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-sv_SE-alma-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-sv_SE-alma-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Liten skog, med många träd', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to try vits-piper-sv_SE-alma-medium with the Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-sv_SE-alma-medium/sv_SE-alma-medium.onnx",
				Tokens:  "vits-piper-sv_SE-alma-medium/tokens.txt",
				DataDir: "vits-piper-sv_SE-alma-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Liten skog, med många träd"

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Liten skog, med många träd

audio samples for each speaker are listed below:

Speaker 0

vits-piper-sv_SE-lisa-medium

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/sv/sv_SE/lisa/medium

Number of speakers | Sample rate
1 | 22050

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-sv_SE-lisa-medium.tar.bz2
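If you prefer to fetch the archive from a script, the following Python sketch downloads and unpacks it using only the standard library. The helper names are hypothetical, and the sketch assumes the archive unpacks into a directory named after the archive, as the paths in the examples below suggest.

```python
import tarfile
import urllib.request
from pathlib import Path

MODEL_URL = (
    "https://github.com/k2-fsa/sherpa-onnx/releases/download/"
    "tts-models/vits-piper-sv_SE-lisa-medium.tar.bz2"
)


def archive_name(url):
    """Return the file name at the end of a download URL."""
    return url.rsplit("/", 1)[-1]


def download(url, dest_dir="."):
    """Download url into dest_dir unless the file is already there."""
    target = Path(dest_dir) / archive_name(url)
    if not target.exists():
        urllib.request.urlretrieve(url, str(target))
    return target


def extract(archive, dest_dir="."):
    """Unpack a .tar.bz2 archive and return the expected model directory.

    Only use this on archives you trust, since tar files can contain
    arbitrary paths.
    """
    archive = Path(archive)
    with tarfile.open(str(archive), "r:bz2") as tar:
        tar.extractall(str(dest_dir))
    # Assumption: the top-level directory matches the archive name.
    return Path(dest_dir) / archive.name[: -len(".tar.bz2")]


# Possible usage (needs network access):
#   model_dir = extract(download(MODEL_URL))
#   print(model_dir)
```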

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI | URL | China mirror
arm64-v8a | Download | Download
armeabi-v7a | Download | Download
x86_64 | Download | Download
x86 | Download | Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to try vits-piper-sv_SE-lisa-medium with the C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-sv_SE-lisa-medium/sv_SE-lisa-medium.onnx";
  config.model.vits.tokens = "vits-piper-sv_SE-lisa-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-sv_SE-lisa-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

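  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
  if (!tts) {
    // Creation fails if, for example, a model path is wrong
    fprintf(stderr, "Failed to create the TTS engine. Please check your config\n");
    return -1;
  }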

  const char *text = "Liten skog, med många träd";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc   -I /tmp/sherpa-onnx/shared/include   -o /tmp/test-piper   /tmp/test-piper.c   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-sv_SE-lisa-medium.tar.bz2

You can use the following code to try vits-piper-sv_SE-lisa-medium with the Python API:

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-sv_SE-lisa-medium/sv_SE-lisa-medium.onnx",
            data_dir="vits-piper-sv_SE-lisa-medium/espeak-ng-data",
            tokens="vits-piper-sv_SE-lisa-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Liten skog, med många träd",
                     sid=0,
                     speed=1.0)

# Write a WAV file, matching the other examples in this section
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
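To confirm that a generated file matches the sample rate listed for this model (22050 Hz), you can inspect the WAV header. The sketch below uses only Python's standard `wave` module; `describe_wav` is a hypothetical helper name, and it assumes a PCM WAV file such as the test.wav written by the other examples in this section.

```python
import wave


def describe_wav(path):
    """Read basic header information from a PCM WAV file."""
    # str() keeps this working when a pathlib.Path is passed in.
    with wave.open(str(path), "rb") as f:
        sample_rate = f.getframerate()
        num_frames = f.getnframes()
        channels = f.getnchannels()
    return {
        "sample_rate": sample_rate,
        "channels": channels,
        "num_frames": num_frames,
        "duration_seconds": num_frames / sample_rate,
    }


# Possible usage after running one of the examples:
#   info = describe_wav("test.wav")
#   assert info["sample_rate"] == 22050, info
```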

C++ API

You can use the following code to try vits-piper-sv_SE-lisa-medium with the C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-sv_SE-lisa-medium/sv_SE-lisa-medium.onnx";
  config.model.vits.tokens = "vits-piper-sv_SE-lisa-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-sv_SE-lisa-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Liten skog, med många träd";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++   -std=c++17   -I /tmp/sherpa-onnx/shared/include   -o /tmp/test-piper   /tmp/test-piper.cc   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-cxx-api   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to try vits-piper-sv_SE-lisa-medium with the Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-sv_SE-lisa-medium/sv_SE-lisa-medium.onnx".into()),
                tokens: Some("vits-piper-sv_SE-lisa-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-sv_SE-lisa-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Liten skog, med många träd";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-sv_SE-lisa-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-sv_SE-lisa-medium/sv_SE-lisa-medium.onnx',
        tokens: 'vits-piper-sv_SE-lisa-medium/tokens.txt',
        dataDir: 'vits-piper-sv_SE-lisa-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Liten skog, med många träd';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to try vits-piper-sv_SE-lisa-medium with the Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-sv_SE-lisa-medium/sv_SE-lisa-medium.onnx',
    tokens: 'vits-piper-sv_SE-lisa-medium/tokens.txt',
    dataDir: 'vits-piper-sv_SE-lisa-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Liten skog, med många träd', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to try vits-piper-sv_SE-lisa-medium with the Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-sv_SE-lisa-medium/sv_SE-lisa-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-sv_SE-lisa-medium/tokens.txt",
    dataDir: "vits-piper-sv_SE-lisa-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Liten skog, med många träd"
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to try vits-piper-sv_SE-lisa-medium with the C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-sv_SE-lisa-medium/sv_SE-lisa-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-sv_SE-lisa-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-sv_SE-lisa-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Liten skog, med många träd";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to try vits-piper-sv_SE-lisa-medium with the Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-sv_SE-lisa-medium/sv_SE-lisa-medium.onnx",
        tokens = "vits-piper-sv_SE-lisa-medium/tokens.txt",
        dataDir = "vits-piper-sv_SE-lisa-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Liten skog, med många träd",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.
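The callbacks in the examples above return 1 to continue generating and 0 to stop. As a minimal sketch of how that convention can be used to cut generation short, here is a Python callback that stops once a sample budget is reached. The names are hypothetical; whether the Python `generate` method of your installed sherpa-onnx version accepts such a callback, and with which arguments, is an assumption you should check against the Python API documentation.

```python
def make_budget_callback(max_samples):
    """Build a callback that asks the generator to stop after
    max_samples samples have been received (illustrative helper)."""
    state = {"received": 0}

    def callback(samples, progress=0.0):
        # samples: the newly generated chunk; progress: fraction in [0, 1]
        state["received"] += len(samples)
        # 1 means continue, 0 means stop (same convention as above)
        return 1 if state["received"] < max_samples else 0

    return callback


# The callback can be exercised without a model:
cb = make_budget_callback(max_samples=5)
print(cb([0.0, 0.0, 0.0], 0.3))  # first chunk, budget not reached
print(cb([0.0, 0.0, 0.0], 0.6))  # second chunk, budget reached
```

Keeping the counter inside a closure avoids global state, so several generations can run with independent budgets.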

Java API

You can use the following code to try vits-piper-sv_SE-lisa-medium with the Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-sv_SE-lisa-medium/sv_SE-lisa-medium.onnx");
    vits.setTokens("vits-piper-sv_SE-lisa-medium/tokens.txt");
    vits.setDataDir("vits-piper-sv_SE-lisa-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Liten skog, med många träd";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to try vits-piper-sv_SE-lisa-medium with the Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-sv_SE-lisa-medium/sv_SE-lisa-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-sv_SE-lisa-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-sv_SE-lisa-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Liten skog, med många träd', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to try vits-piper-sv_SE-lisa-medium with the Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-sv_SE-lisa-medium/sv_SE-lisa-medium.onnx",
				Tokens:  "vits-piper-sv_SE-lisa-medium/tokens.txt",
				DataDir: "vits-piper-sv_SE-lisa-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Liten skog, med många träd"

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Liten skog, med många träd

audio samples for each speaker are listed below:

Speaker 0

vits-piper-sv_SE-nst-medium

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/sv/sv_SE/nst/medium

Number of speakers | Sample rate
1 | 22050

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-sv_SE-nst-medium.tar.bz2

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI | URL | China mirror
arm64-v8a | Download | Download
armeabi-v7a | Download | Download
x86_64 | Download | Download
x86 | Download | Download

If you don’t know what ABI is, you probably need to select arm64-v8a.
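If you can read the device's CPU architecture string (for example from `adb shell uname -m`), it maps to one of the four ABIs in the table. The following Python sketch shows that mapping; the table and function names are illustrative, and the fallback to arm64-v8a simply mirrors the advice above rather than any detection logic in sherpa-onnx.

```python
# Common `uname -m` values and the Android ABI they correspond to.
ABI_BY_MACHINE = {
    "aarch64": "arm64-v8a",
    "arm64": "arm64-v8a",
    "armv7l": "armeabi-v7a",
    "x86_64": "x86_64",
    "i686": "x86",
    "i386": "x86",
}


def abi_for(machine, default="arm64-v8a"):
    """Return the Android ABI name for a CPU architecture string.

    Unknown values fall back to arm64-v8a, the most likely choice
    according to the note above.
    """
    return ABI_BY_MACHINE.get(machine.strip().lower(), default)


print(abi_for("aarch64"))
```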

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to try vits-piper-sv_SE-nst-medium with the C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-sv_SE-nst-medium/sv_SE-nst-medium.onnx";
  config.model.vits.tokens = "vits-piper-sv_SE-nst-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-sv_SE-nst-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

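  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
  if (!tts) {
    // Creation fails if, for example, a model path is wrong
    fprintf(stderr, "Failed to create the TTS engine. Please check your config\n");
    return -1;
  }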

  const char *text = "Liten skog, med många träd";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
  -DSHERPA_ONNX_ENABLE_C_API=ON \
  -DCMAKE_BUILD_TYPE=Release \
  -DBUILD_SHARED_LIBS=ON \
  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
  ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

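# The source file is listed before the -l options because some linkers
# resolve libraries in command-line order.
gcc \
  -I /tmp/sherpa-onnx/shared/include \
  -L /tmp/sherpa-onnx/shared/lib \
  -o /tmp/test-piper \
  /tmp/test-piper.c \
  -lsherpa-onnx-c-api \
  -lonnxruntime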

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.
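
If you launch the executable from a script rather than from a shell, the same library path can be passed through the process environment. The following is a minimal sketch; the helper name `library_env` is hypothetical, and it only handles the two variables shown above.

```python
import os
import platform


def library_env(lib_dir, system=None, base_env=None):
    """Return a copy of the environment with lib_dir prepended to the
    shared-library search path used on Linux or macOS."""
    system = system or platform.system()
    env = dict(os.environ if base_env is None else base_env)
    var = {"Linux": "LD_LIBRARY_PATH", "Darwin": "DYLD_LIBRARY_PATH"}.get(system)
    if var is None:
        # Other platforms are left untouched in this sketch.
        return env
    old = env.get(var, "")
    env[var] = lib_dir + (":" + old if old else "")
    return env
```

For example, `subprocess.run(["/tmp/test-piper"], cwd="/tmp", env=library_env("/tmp/sherpa-onnx/shared/lib"))` runs the compiled example with the library directory on the search path.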

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-sv_SE-nst-medium.tar.bz2

You can use the following code to play with vits-piper-sv_SE-nst-medium

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-sv_SE-nst-medium/sv_SE-nst-medium.onnx",
            data_dir="vits-piper-sv_SE-nst-medium/espeak-ng-data",
            tokens="vits-piper-sv_SE-nst-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Liten skog, med många träd",
                     sid=0,
                     speed=1.0)

# Write a WAV file, like the other examples on this page
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
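
If soundfile is not installed, the generated samples can also be saved with the Python standard library. The following is a minimal sketch, assuming the samples are mono floats in the range [-1, 1]; the function name `write_wave_pcm16` is made up for this illustration.

```python
import struct
import wave


def write_wave_pcm16(filename, samples, sample_rate):
    """Write mono float samples (assumed to lie in [-1, 1]) as 16-bit PCM."""
    frames = bytearray()
    for s in samples:
        clipped = max(-1.0, min(1.0, float(s)))  # guard against overshoot
        frames += struct.pack("<h", int(round(clipped * 32767)))
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)   # mono
        f.setsampwidth(2)   # 2 bytes per sample = 16 bits
        f.setframerate(int(sample_rate))
        f.writeframes(bytes(frames))
```

With the objects from the example above, the call would be `write_wave_pcm16("test.wav", audio.samples, audio.sample_rate)`.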

C++ API

You can use the following code to try vits-piper-sv_SE-nst-medium with the C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-sv_SE-nst-medium/sv_SE-nst-medium.onnx";
  config.model.vits.tokens = "vits-piper-sv_SE-nst-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-sv_SE-nst-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Liten skog, med många träd";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
  -DSHERPA_ONNX_ENABLE_C_API=ON \
  -DCMAKE_BUILD_TYPE=Release \
  -DBUILD_SHARED_LIBS=ON \
  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
  ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

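# The source file is listed before the -l options because some linkers
# resolve libraries in command-line order.
g++ \
  -std=c++17 \
  -I /tmp/sherpa-onnx/shared/include \
  -L /tmp/sherpa-onnx/shared/lib \
  -o /tmp/test-piper \
  /tmp/test-piper.cc \
  -lsherpa-onnx-cxx-api \
  -lsherpa-onnx-c-api \
  -lonnxruntime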

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to try vits-piper-sv_SE-nst-medium with the Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-sv_SE-nst-medium/sv_SE-nst-medium.onnx".into()),
                tokens: Some("vits-piper-sv_SE-nst-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-sv_SE-nst-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Liten skog, med många träd";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-sv_SE-nst-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-sv_SE-nst-medium/sv_SE-nst-medium.onnx',
        tokens: 'vits-piper-sv_SE-nst-medium/tokens.txt',
        dataDir: 'vits-piper-sv_SE-nst-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Liten skog, med många träd';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.
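
The example above reports a real-time factor (RTF): the time spent generating divided by the duration of the generated audio, so values below 1 mean generation is faster than playback. The same arithmetic is sketched below in Python; the function names are made up for this illustration.

```python
import time


def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """RTF = processing time / audio duration."""
    duration = num_samples / sample_rate
    if duration <= 0:
        raise ValueError("audio duration must be positive")
    return elapsed_seconds / duration


def timed(fn):
    """Call fn() and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start


if __name__ == "__main__":
    # 44100 samples at 22050 Hz is 2 s of audio; 0.5 s of work gives RTF 0.25
    print(real_time_factor(0.5, 44100, 22050))
```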

Dart API

You can use the following code to try vits-piper-sv_SE-nst-medium with the Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-sv_SE-nst-medium/sv_SE-nst-medium.onnx',
    tokens: 'vits-piper-sv_SE-nst-medium/tokens.txt',
    dataDir: 'vits-piper-sv_SE-nst-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Liten skog, med många träd', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to try vits-piper-sv_SE-nst-medium with the Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-sv_SE-nst-medium/sv_SE-nst-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-sv_SE-nst-medium/tokens.txt",
    dataDir: "vits-piper-sv_SE-nst-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Liten skog, med många träd"
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to try vits-piper-sv_SE-nst-medium with the C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-sv_SE-nst-medium/sv_SE-nst-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-sv_SE-nst-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-sv_SE-nst-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Liten skog, med många träd";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to try vits-piper-sv_SE-nst-medium with the Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-sv_SE-nst-medium/sv_SE-nst-medium.onnx",
        tokens = "vits-piper-sv_SE-nst-medium/tokens.txt",
        dataDir = "vits-piper-sv_SE-nst-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Liten skog, med många träd",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.
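
The callback convention used above (return 1 to continue, return 0 to stop) can be illustrated without the library. The following Python sketch simulates a generator that hands audio chunks to a callback and stops as soon as the callback returns 0. It is not how sherpa-onnx is implemented; `run_with_callback` and `stop_after` are hypothetical names, and the callback here also receives a progress value, as the C and C++ callbacks on this page do.

```python
def run_with_callback(chunks, callback):
    """Deliver chunks one by one; stop early when callback returns 0."""
    delivered = []
    total = len(chunks)
    for i, chunk in enumerate(chunks, start=1):
        delivered.append(chunk)
        if callback(chunk, i / total) == 0:
            break  # the callback asked to stop generating
    return delivered


def stop_after(limit):
    """Build a callback that returns 0 once `limit` samples were seen."""
    seen = 0

    def callback(samples, progress):
        nonlocal seen
        seen += len(samples)
        return 1 if seen < limit else 0

    return callback


if __name__ == "__main__":
    chunks = [[0.0] * 100 for _ in range(5)]
    print(len(run_with_callback(chunks, stop_after(250))))
```

A callback of this kind is useful for cancelling a long synthesis, for example when the user presses a stop button.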

Java API

You can use the following code to try vits-piper-sv_SE-nst-medium with the Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-sv_SE-nst-medium/sv_SE-nst-medium.onnx");
    vits.setTokens("vits-piper-sv_SE-nst-medium/tokens.txt");
    vits.setDataDir("vits-piper-sv_SE-nst-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Liten skog, med många träd";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to try vits-piper-sv_SE-nst-medium with the Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-sv_SE-nst-medium/sv_SE-nst-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-sv_SE-nst-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-sv_SE-nst-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Liten skog, med många träd', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to try vits-piper-sv_SE-nst-medium with the Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-sv_SE-nst-medium/sv_SE-nst-medium.onnx",
				Tokens:  "vits-piper-sv_SE-nst-medium/tokens.txt",
				DataDir: "vits-piper-sv_SE-nst-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Liten skog, med många träd"

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Liten skog, med många träd

sample audio for each speaker is listed below:

Speaker 0

supertonic-3-sv

Info about this model

This model is supertonic 3 from https://huggingface.co/Supertone/supertonic-3

It supports 31 languages: en, ko, ja, ar, bg, cs, da, de, el, es, et, fi, fr, hi, hr, hu, id, it, lt, lv, nl, pl, pt, ro, ru, sk, sl, sv, tr, uk, vi.

This page shows samples for Swedish (sv).
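
The examples on this page select the language through an extra `lang` entry in the generation config, which the C API receives as a JSON string. The following sketch checks a language code against the list above and builds that string. The names `SUPERTONIC_3_LANGS` and `make_extra` are made up for this illustration.

```python
import json

# The 31 language codes listed above.
SUPERTONIC_3_LANGS = (
    "en", "ko", "ja", "ar", "bg", "cs", "da", "de", "el", "es", "et",
    "fi", "fr", "hi", "hr", "hu", "id", "it", "lt", "lv", "nl", "pl",
    "pt", "ro", "ru", "sk", "sl", "sv", "tr", "uk", "vi",
)


def make_extra(lang):
    """Return the JSON string for the `extra` field, e.g. {"lang": "sv"}."""
    if lang not in SUPERTONIC_3_LANGS:
        raise ValueError("unsupported language code: " + repr(lang))
    return json.dumps({"lang": lang})


if __name__ == "__main__":
    print(make_extra("sv"))
```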

| Number of speakers | Sample rate (Hz) |
|---|---|
| 10 | 24000 |

Speaker IDs

sid: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-supertonic-3-tts-int8-2026-05-11.tar.bz2
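
The examples on this page refer to seven files inside the extracted model directory. The following sketch builds those paths from one directory name and reports any that are missing, which can help catch an incomplete download before the model is loaded. The names `SUPERTONIC_FILES`, `supertonic_paths`, and `missing_files` are made up for this illustration; the file names are the ones used in the examples on this page.

```python
from pathlib import Path

# Config field name -> file name inside the model directory.
SUPERTONIC_FILES = {
    "duration_predictor": "duration_predictor.int8.onnx",
    "text_encoder": "text_encoder.int8.onnx",
    "vector_estimator": "vector_estimator.int8.onnx",
    "vocoder": "vocoder.int8.onnx",
    "tts_json": "tts.json",
    "unicode_indexer": "unicode_indexer.bin",
    "voice_style": "voice.bin",
}


def supertonic_paths(model_dir):
    """Return a dict mapping each config field to its file path."""
    root = Path(model_dir)
    return {key: (root / name).as_posix() for key, name in SUPERTONIC_FILES.items()}


def missing_files(paths):
    """Return the paths that do not exist on disk, sorted."""
    return sorted(p for p in paths.values() if not Path(p).is_file())
```

The keys match the keyword arguments used in the Python example on this page, so the result could be passed as `sherpa_onnx.OfflineTtsSupertonicModelConfig(**supertonic_paths(model_dir))`, assuming those keyword names are accepted by your installed version.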

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

| ABI | URL | China mirror |
|---|---|---|
| arm64-v8a | Download | Download |
| armeabi-v7a | Download | Download |
| x86_64 | Download | Download |
| x86 | Download | Download |

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-supertonic-3-tts-int8-2026-05-11.tar.bz2

You can use the following code to play with supertonic-3

import sherpa_onnx
import soundfile as sf

tts_config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        supertonic=sherpa_onnx.OfflineTtsSupertonicModelConfig(
            duration_predictor="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
            text_encoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
            vector_estimator="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
            vocoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
            tts_json="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
            unicode_indexer="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
            voice_style="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
        ),
        debug=False,
        num_threads=2,
        provider="cpu",
    ),
)

tts = sherpa_onnx.OfflineTts(tts_config)

gen_config = sherpa_onnx.GenerationConfig()
gen_config.sid = 0
gen_config.num_steps = 8
gen_config.speed = 1.0
gen_config.extra["lang"] = "sv"

audio = tts.generate("Detta är en text till tal-motor som använder nästa generations kaldi", gen_config)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
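
To compare the speakers, you can repeat the generation once per speaker ID. The following sketch is written against duck-typed objects so that it does not depend on a particular version of the package: `tts` is expected to offer `generate(text, config)` and `new_config` is expected to return an object with `sid`, `num_steps`, `speed`, and a dict-like `extra`, as in the example above. The function name `generate_for_all_speakers` is hypothetical.

```python
def generate_for_all_speakers(tts, text, new_config, num_speakers=10, lang="sv"):
    """Return a dict mapping each speaker ID to its generated audio."""
    audios = {}
    for sid in range(num_speakers):
        cfg = new_config()       # fresh config for every speaker
        cfg.sid = sid
        cfg.num_steps = 8        # same value as in the example above
        cfg.speed = 1.0
        cfg.extra["lang"] = lang
        audios[sid] = tts.generate(text, cfg)
    return audios
```

With the objects from the example above, a call could look like `generate_for_all_speakers(tts, text, sherpa_onnx.GenerationConfig)`, after which each result can be written to its own file, for example `test-sid3.wav`.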

C API

You can use the following code to try supertonic-3 with the C API.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));
  config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
  config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
  config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
  config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
  config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
  config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
  config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";

  config.model.num_threads = 2;

  // If you don't want to see debug messages, please set it to 0
  config.model.debug = 1;

  const char *text = "Detta är en text till tal-motor som använder nästa generations kaldi";

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.num_steps = 8;
  gen_cfg.speed = 1.0;
  gen_cfg.extra = "{\"lang\": \"sv\"}";

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg,
                                             ProgressCallback, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
 -DSHERPA_ONNX_ENABLE_C_API=ON \
 -DCMAKE_BUILD_TYPE=Release \
 -DBUILD_SHARED_LIBS=ON \
 -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
 ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-supertonic.c. Then you can compile it with the following command:

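# The source file is listed before the -l options because some linkers
# resolve libraries in command-line order.
gcc \
  -I /tmp/sherpa-onnx/shared/include \
  -L /tmp/sherpa-onnx/shared/lib \
  -o /tmp/test-supertonic \
  /tmp/test-supertonic.c \
  -lsherpa-onnx-c-api \
  -lonnxruntime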

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-supertonic.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

C++ API

You can use the following code to try supertonic-3 with the C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;

  config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
  config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
  config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
  config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
  config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
  config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
  config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";

  config.model.num_threads = 2;

  // If you don't want to see debug messages, please set it to 0
  config.model.debug = 1;

  std::string filename = "./test.wav";
  std::string text = "Detta är en text till tal-motor som använder nästa generations kaldi";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.num_steps = 8;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed
  gen_cfg.extra = R"({"lang": "sv"})";

  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
 -DSHERPA_ONNX_ENABLE_C_API=ON \
 -DCMAKE_BUILD_TYPE=Release \
 -DBUILD_SHARED_LIBS=ON \
 -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
 ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-supertonic.cc. Then you can compile it with the following command:

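# The source file is listed before the -l options because some linkers
# resolve libraries in command-line order.
g++ \
  -std=c++17 \
  -I /tmp/sherpa-onnx/shared/include \
  -L /tmp/sherpa-onnx/shared/lib \
  -o /tmp/test-supertonic \
  /tmp/test-supertonic.cc \
  -lsherpa-onnx-cxx-api \
  -lsherpa-onnx-c-api \
  -lonnxruntime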

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-supertonic.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to try supertonic-3 with the Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsSupertonicModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            supertonic: OfflineTtsSupertonicModelConfig {
                duration_predictor: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx".into()),
                text_encoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx".into()),
                vector_estimator: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx".into()),
                vocoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx".into()),
                tts_json: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json".into()),
                unicode_indexer: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin".into()),
                voice_style: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Detta är en text till tal-motor som använder nästa generations kaldi";

    let gen_config = GenerationConfig {
        sid: 0,
        num_steps: 8,
        speed: 1.0,
        extra: Some(r#"{"lang": "sv"}"#.into()),
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with supertonic-3 with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      supertonic: {
        durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
        textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
        vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
        vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
        ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
        unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
        voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
      },
      debug: true,
      numThreads: 2,
      provider: 'cpu',
    },
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Detta är en text till tal-motor som använder nästa generations kaldi';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  numSteps: 8,
  speed: 1.0,
  extra: {lang: 'sv'},
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to try supertonic-3 with the Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final supertonic = sherpa_onnx.OfflineTtsSupertonicModelConfig(
    durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
    textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
    vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
    vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
    ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
    unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
    voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    supertonic: supertonic,
    numThreads: 2,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    numSteps: 8,
    speed: 1.0,
    extra: {'lang': 'sv'},
  );
  final audio = tts.generateWithConfig(text: 'Detta är en text till tal-motor som använder nästa generations kaldi', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to try supertonic-3 with the Swift API.

func run() {
  let supertonic = sherpaOnnxOfflineTtsSupertonicModelConfig(
    durationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
    textEncoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
    vectorEstimator: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
    vocoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
    ttsJson: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
    unicodeIndexer: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
    voiceStyle: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(supertonic: supertonic)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Detta är en text till tal-motor som använder nästa generations kaldi"
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.numSteps = 8
  genConfig.speed = 1.0
  genConfig.extra = ["lang": "sv"]

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with supertonic-3 with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Supertonic.DurationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.Model.Supertonic.TextEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.Model.Supertonic.VectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.Model.Supertonic.Vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.Model.Supertonic.TtsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.Model.Supertonic.UnicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.Model.Supertonic.VoiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";

config.Model.NumThreads = 2;
config.Model.Debug = 1;
config.Model.Provider = "cpu";

var tts = new OfflineTts(config);
var text = "Detta är en text till tal-motor som använder nästa generations kaldi";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.NumSteps = 8;
genConfig.Speed = 1.0f;
genConfig.Extra = "{\"lang\": \"sv\"}";

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.
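
The example above writes the `Extra` value as a hand-escaped JSON string. As a minimal sketch in Python, assuming the field only has to hold a valid JSON object with the `lang` key shown on this page, a serializer can produce the same string without manual escaping. `make_extra` is a hypothetical helper name and is not part of sherpa-onnx.

```python
import json


def make_extra(lang: str) -> str:
    """Return the per-request options as a JSON object string.

    Hypothetical helper for illustration; only the "lang" key used on
    this page is assumed.
    """
    return json.dumps({"lang": lang})


# Swedish, as in the examples on this page.
print(make_extra("sv"))
```

The printed value is the same JSON text that the example above spells out with escaped quotes.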

Kotlin API

You can use the following code to play with supertonic-3 with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      supertonic = OfflineTtsSupertonicModelConfig(
        durationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
        textEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
        vectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
        vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
        ttsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
        unicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
        voiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
      ),
      numThreads = 2,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    numSteps = 8,
    speed = 1.0f,
    extra = mapOf("lang" to "sv"),
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Detta är en text till tal-motor som använder nästa generations kaldi",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with supertonic-3 with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var supertonic = new OfflineTtsSupertonicModelConfig();
    supertonic.setDurationPredictor("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx");
    supertonic.setTextEncoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx");
    supertonic.setVectorEstimator("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx");
    supertonic.setVocoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx");
    supertonic.setTtsJson("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json");
    supertonic.setUnicodeIndexer("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin");
    supertonic.setVoiceStyle("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setSupertonic(supertonic);
    modelConfig.setNumThreads(2);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);

    var tts = new OfflineTts(config);
    var text = "Detta är en text till tal-motor som använder nästa generations kaldi";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setNumSteps(8);
    genConfig.setSpeed(1.0f);
    genConfig.setExtra("{\"lang\": \"sv\"}");

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with supertonic-3 with Pascal API.

program test_supertonic;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Supertonic.DurationPredictor := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx';
  Config.Model.Supertonic.TextEncoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx';
  Config.Model.Supertonic.VectorEstimator := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx';
  Config.Model.Supertonic.Vocoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx';
  Config.Model.Supertonic.TtsJson := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json';
  Config.Model.Supertonic.UnicodeIndexer := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin';
  Config.Model.Supertonic.VoiceStyle := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin';

  Config.Model.NumThreads := 2;
  Config.Model.Debug := True;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.NumSteps := 8;
  GenConfig.Speed := 1.0;
  GenConfig.Extra := '{"lang": "sv"}';

  Audio := Tts.GenerateWithConfig('Detta är en text till tal-motor som använder nästa generations kaldi', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with supertonic-3 with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Supertonic: sherpa.OfflineTtsSupertonicModelConfig{
				DurationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
				TextEncoder:       "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
				VectorEstimator:   "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
				Vocoder:           "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
				TtsJson:           "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
				UnicodeIndexer:    "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
				VoiceStyle:        "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
			},
			NumThreads: 2,
			Debug:      true,
		},
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Detta är en text till tal-motor som använder nästa generations kaldi"

	genConfig := sherpa.GenerationConfig{
		Sid:       0,
		NumSteps:  8,
		Speed:     1.0,
		Extra:     `{"lang": "sv"}`,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

Sample audio for each speaker is listed below:

Speaker 0

0

Hej världen.

1

Hur mår du idag?

2

Himlen är blå och vinden är mild.

3

Maskininlärning hjälper datorer att lära sig av data.

4

Talsyntes omvandlar text till tydligt ljud.

5

Eleverna läste en kort berättelse på biblioteket.

6

Tåget blev försenat på grund av spårunderhåll.

7

Små modeller kör snabbt på lokala enheter.

8

En röstassistent hjälper till med vardagliga uppgifter.

9

Stabil uppläsning är viktig för korta och långa meningar.

Speaker 1

0

Hej världen.

1

Hur mår du idag?

2

Himlen är blå och vinden är mild.

3

Maskininlärning hjälper datorer att lära sig av data.

4

Talsyntes omvandlar text till tydligt ljud.

5

Eleverna läste en kort berättelse på biblioteket.

6

Tåget blev försenat på grund av spårunderhåll.

7

Små modeller kör snabbt på lokala enheter.

8

En röstassistent hjälper till med vardagliga uppgifter.

9

Stabil uppläsning är viktig för korta och långa meningar.

Speaker 2

0

Hej världen.

1

Hur mår du idag?

2

Himlen är blå och vinden är mild.

3

Maskininlärning hjälper datorer att lära sig av data.

4

Talsyntes omvandlar text till tydligt ljud.

5

Eleverna läste en kort berättelse på biblioteket.

6

Tåget blev försenat på grund av spårunderhåll.

7

Små modeller kör snabbt på lokala enheter.

8

En röstassistent hjälper till med vardagliga uppgifter.

9

Stabil uppläsning är viktig för korta och långa meningar.

Speaker 3

0

Hej världen.

1

Hur mår du idag?

2

Himlen är blå och vinden är mild.

3

Maskininlärning hjälper datorer att lära sig av data.

4

Talsyntes omvandlar text till tydligt ljud.

5

Eleverna läste en kort berättelse på biblioteket.

6

Tåget blev försenat på grund av spårunderhåll.

7

Små modeller kör snabbt på lokala enheter.

8

En röstassistent hjälper till med vardagliga uppgifter.

9

Stabil uppläsning är viktig för korta och långa meningar.

Speaker 4

0

Hej världen.

1

Hur mår du idag?

2

Himlen är blå och vinden är mild.

3

Maskininlärning hjälper datorer att lära sig av data.

4

Talsyntes omvandlar text till tydligt ljud.

5

Eleverna läste en kort berättelse på biblioteket.

6

Tåget blev försenat på grund av spårunderhåll.

7

Små modeller kör snabbt på lokala enheter.

8

En röstassistent hjälper till med vardagliga uppgifter.

9

Stabil uppläsning är viktig för korta och långa meningar.

Speaker 5

0

Hej världen.

1

Hur mår du idag?

2

Himlen är blå och vinden är mild.

3

Maskininlärning hjälper datorer att lära sig av data.

4

Talsyntes omvandlar text till tydligt ljud.

5

Eleverna läste en kort berättelse på biblioteket.

6

Tåget blev försenat på grund av spårunderhåll.

7

Små modeller kör snabbt på lokala enheter.

8

En röstassistent hjälper till med vardagliga uppgifter.

9

Stabil uppläsning är viktig för korta och långa meningar.

Speaker 6

0

Hej världen.

1

Hur mår du idag?

2

Himlen är blå och vinden är mild.

3

Maskininlärning hjälper datorer att lära sig av data.

4

Talsyntes omvandlar text till tydligt ljud.

5

Eleverna läste en kort berättelse på biblioteket.

6

Tåget blev försenat på grund av spårunderhåll.

7

Små modeller kör snabbt på lokala enheter.

8

En röstassistent hjälper till med vardagliga uppgifter.

9

Stabil uppläsning är viktig för korta och långa meningar.

Speaker 7

0

Hej världen.

1

Hur mår du idag?

2

Himlen är blå och vinden är mild.

3

Maskininlärning hjälper datorer att lära sig av data.

4

Talsyntes omvandlar text till tydligt ljud.

5

Eleverna läste en kort berättelse på biblioteket.

6

Tåget blev försenat på grund av spårunderhåll.

7

Små modeller kör snabbt på lokala enheter.

8

En röstassistent hjälper till med vardagliga uppgifter.

9

Stabil uppläsning är viktig för korta och långa meningar.

Speaker 8

0

Hej världen.

1

Hur mår du idag?

2

Himlen är blå och vinden är mild.

3

Maskininlärning hjälper datorer att lära sig av data.

4

Talsyntes omvandlar text till tydligt ljud.

5

Eleverna läste en kort berättelse på biblioteket.

6

Tåget blev försenat på grund av spårunderhåll.

7

Små modeller kör snabbt på lokala enheter.

8

En röstassistent hjälper till med vardagliga uppgifter.

9

Stabil uppläsning är viktig för korta och långa meningar.

Speaker 9

0

Hej världen.

1

Hur mår du idag?

2

Himlen är blå och vinden är mild.

3

Maskininlärning hjälper datorer att lära sig av data.

4

Talsyntes omvandlar text till tydligt ljud.

5

Eleverna läste en kort berättelse på biblioteket.

6

Tåget blev försenat på grund av spårunderhåll.

7

Små modeller kör snabbt på lokala enheter.

8

En röstassistent hjälper till med vardagliga uppgifter.

9

Stabil uppläsning är viktig för korta och långa meningar.

Turkish

This section lists text to speech models for Turkish.

vits-piper-tr_TR-dfki-medium

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/tr/tr_TR/dfki/medium

Number of speakers | Sample rate (Hz)
1 | 22050

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-tr_TR-dfki-medium.tar.bz2
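
If you prefer to fetch the archive from a script, the following is a minimal Python sketch that uses only the standard library. The helper names (`archive_name`, `extract_archive`, `download_and_extract`) are hypothetical, and the sketch assumes that the archive unpacks into a directory named after the archive, which is what the paths in the examples on this page expect.

```python
import tarfile
import urllib.request
from pathlib import Path

MODEL_URL = (
    "https://github.com/k2-fsa/sherpa-onnx/releases/download/"
    "tts-models/vits-piper-tr_TR-dfki-medium.tar.bz2"
)


def archive_name(url: str) -> str:
    """Return the file name at the end of the download URL."""
    return url.rsplit("/", 1)[-1]


def extract_archive(archive: Path, dest: Path) -> None:
    """Unpack a .tar.bz2 archive into dest."""
    with tarfile.open(archive, "r:bz2") as tar:
        tar.extractall(dest)


def download_and_extract(url: str = MODEL_URL, dest: Path = Path(".")) -> Path:
    """Download the archive, unpack it, delete it, and return the model directory.

    Assumption: the top-level directory inside the archive has the same
    name as the archive without its .tar.bz2 suffix.
    """
    archive = dest / archive_name(url)
    urllib.request.urlretrieve(url, archive)
    extract_archive(archive, dest)
    archive.unlink()
    return dest / archive.name.removesuffix(".tar.bz2")
```

Calling `download_and_extract()` from your working directory should leave a `vits-piper-tr_TR-dfki-medium` directory next to your script, under the assumption stated above.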

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI | URL | China mirror
arm64-v8a | Download | Download
armeabi-v7a | Download | Download
x86_64 | Download | Download
x86 | Download | Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-tr_TR-dfki-medium with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-tr_TR-dfki-medium/tr_TR-dfki-medium.onnx";
  config.model.vits.tokens = "vits-piper-tr_TR-dfki-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-tr_TR-dfki-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Bir evin duvarları, bir adamın sözü, bir kadının gülü kırılmaz";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header and library files in /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

If the shared libraries cannot be found at startup, you probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx soundfile

and you have downloaded and extracted the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-tr_TR-dfki-medium.tar.bz2

You can use the following code to play with vits-piper-tr_TR-dfki-medium

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-tr_TR-dfki-medium/tr_TR-dfki-medium.onnx",
            data_dir="vits-piper-tr_TR-dfki-medium/espeak-ng-data",
            tokens="vits-piper-tr_TR-dfki-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Bir evin duvarları, bir adamın sözü, bir kadının gülü kırılmaz",
                     sid=0,
                     speed=1.0)

sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
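
To check how fast synthesis runs on your machine, you can compute the real-time factor (RTF), that is, the processing time divided by the duration of the generated audio. The following is a minimal sketch; `real_time_factor` is a hypothetical helper, and the commented usage assumes the `tts` object from the example above.

```python
def real_time_factor(elapsed_seconds: float, num_samples: int, sample_rate: int) -> float:
    """Return processing time divided by audio duration.

    A value below 1.0 means synthesis ran faster than real time.
    """
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate
    return elapsed_seconds / duration


# Usage with the example above (not executed here):
#   import time
#   start = time.perf_counter()
#   audio = tts.generate(text="...", sid=0, speed=1.0)
#   elapsed = time.perf_counter() - start
#   print(real_time_factor(elapsed, len(audio.samples), audio.sample_rate))

# Self-contained demonstration: 1 s of 22050 Hz audio produced in 0.25 s.
print(real_time_factor(0.25, 22050, 22050))  # prints 0.25
```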

C++ API

You can use the following code to play with vits-piper-tr_TR-dfki-medium with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-tr_TR-dfki-medium/tr_TR-dfki-medium.onnx";
  config.model.vits.tokens = "vits-piper-tr_TR-dfki-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-tr_TR-dfki-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Bir evin duvarları, bir adamın sözü, bir kadının gülü kırılmaz";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header and library files in /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

If the shared libraries cannot be found at startup, you probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with vits-piper-tr_TR-dfki-medium with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-tr_TR-dfki-medium/tr_TR-dfki-medium.onnx".into()),
                tokens: Some("vits-piper-tr_TR-dfki-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-tr_TR-dfki-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Bir evin duvarları, bir adamın sözü, bir kadının gülü kırılmaz";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-tr_TR-dfki-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-tr_TR-dfki-medium/tr_TR-dfki-medium.onnx',
        tokens: 'vits-piper-tr_TR-dfki-medium/tokens.txt',
        dataDir: 'vits-piper-tr_TR-dfki-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Bir evin duvarları, bir adamın sözü, bir kadının gülü kırılmaz';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to play with vits-piper-tr_TR-dfki-medium with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-tr_TR-dfki-medium/tr_TR-dfki-medium.onnx',
    tokens: 'vits-piper-tr_TR-dfki-medium/tokens.txt',
    dataDir: 'vits-piper-tr_TR-dfki-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Bir evin duvarları, bir adamın sözü, bir kadının gülü kırılmaz', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-tr_TR-dfki-medium with Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-tr_TR-dfki-medium/tr_TR-dfki-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-tr_TR-dfki-medium/tokens.txt",
    dataDir: "vits-piper-tr_TR-dfki-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Bir evin duvarları, bir adamın sözü, bir kadının gülü kırılmaz"
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-tr_TR-dfki-medium with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-tr_TR-dfki-medium/tr_TR-dfki-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-tr_TR-dfki-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-tr_TR-dfki-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Bir evin duvarları, bir adamın sözü, bir kadının gülü kırılmaz";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-tr_TR-dfki-medium with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-tr_TR-dfki-medium/tr_TR-dfki-medium.onnx",
        tokens = "vits-piper-tr_TR-dfki-medium/tokens.txt",
        dataDir = "vits-piper-tr_TR-dfki-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Bir evin duvarları, bir adamın sözü, bir kadının gülü kırılmaz",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with vits-piper-tr_TR-dfki-medium with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-tr_TR-dfki-medium/tr_TR-dfki-medium.onnx");
    vits.setTokens("vits-piper-tr_TR-dfki-medium/tokens.txt");
    vits.setDataDir("vits-piper-tr_TR-dfki-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Bir evin duvarları, bir adamın sözü, bir kadının gülü kırılmaz";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-tr_TR-dfki-medium with Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-tr_TR-dfki-medium/tr_TR-dfki-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-tr_TR-dfki-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-tr_TR-dfki-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Bir evin duvarları, bir adamın sözü, bir kadının gülü kırılmaz', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-tr_TR-dfki-medium with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-tr_TR-dfki-medium/tr_TR-dfki-medium.onnx",
				Tokens:  "vits-piper-tr_TR-dfki-medium/tokens.txt",
				DataDir: "vits-piper-tr_TR-dfki-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Bir evin duvarları, bir adamın sözü, bir kadının gülü kırılmaz"

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Bir evin duvarları, bir adamın sözü, bir kadının gülü kırılmaz

sample audios for different speakers are listed below:

Speaker 0

supertonic-3-tr

Info about this model

This model is supertonic 3 from https://huggingface.co/Supertone/supertonic-3

It supports 31 languages: en, ko, ja, ar, bg, cs, da, de, el, es, et, fi, fr, hi, hr, hu, id, it, lt, lv, nl, pl, pt, ro, ru, sk, sl, sv, tr, uk, vi.

This page shows samples for Turkish (tr).
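The API examples on this page select the language through an `extra` setting, which several bindings take as a small JSON string such as `{"lang": "tr"}`. The following is a minimal Python sketch of building that string and checking the code against the language list above. The helper name `make_extra` and the constant `SUPPORTED_LANGS` are hypothetical and are not part of sherpa-onnx.

```python
import json

# Language codes copied from the list above.
SUPPORTED_LANGS = (
    "en ko ja ar bg cs da de el es et fi fr hi hr hu id it lt lv "
    "nl pl pt ro ru sk sl sv tr uk vi"
).split()


def make_extra(lang):
    """Return a JSON string of the form {"lang": "<code>"} (hypothetical helper)."""
    if lang not in SUPPORTED_LANGS:
        raise ValueError(f"unsupported language code: {lang!r}")
    return json.dumps({"lang": lang})


print(len(SUPPORTED_LANGS))  # 31
print(make_extra("tr"))      # {"lang": "tr"}
```

Rejecting an unknown code before calling the engine gives a clearer error than a failed or silent generation would.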

| Number of speakers | Sample rate |
|---|---|
| 10 | 24000 |

Speaker IDs

sid: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-supertonic-3-tts-int8-2026-05-11.tar.bz2

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

| ABI | URL | China mirror |
|---|---|---|
| arm64-v8a | Download | Download |
| armeabi-v7a | Download | Download |
| x86_64 | Download | Download |
| x86 | Download | Download |

If you don’t know what ABI is, you probably need to select arm64-v8a.
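If you can read the device's machine name (for example from `uname -m` in an `adb shell`), the ABI can usually be guessed from it. The sketch below is illustrative only: the mapping table is an assumption covering common machine names, not an exhaustive list, and `guess_abi` is a hypothetical helper. It falls back to arm64-v8a, following the advice above.

```python
# Common machine names mapped to the ABI names used in the table above.
# This mapping is an illustrative assumption and is not exhaustive.
MACHINE_TO_ABI = {
    "aarch64": "arm64-v8a",
    "arm64": "arm64-v8a",
    "armv7l": "armeabi-v7a",
    "x86_64": "x86_64",
    "amd64": "x86_64",
    "i686": "x86",
    "i386": "x86",
}


def guess_abi(machine):
    """Guess the APK ABI for a machine name, defaulting to arm64-v8a."""
    return MACHINE_TO_ABI.get(machine.strip().lower(), "arm64-v8a")


print(guess_abi("aarch64"))  # arm64-v8a
print(guess_abi("armv7l"))   # armeabi-v7a
```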

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-supertonic-3-tts-int8-2026-05-11.tar.bz2

You can use the following code to play with supertonic-3

import sherpa_onnx
import soundfile as sf

tts_config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        supertonic=sherpa_onnx.OfflineTtsSupertonicModelConfig(
            duration_predictor="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
            text_encoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
            vector_estimator="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
            vocoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
            tts_json="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
            unicode_indexer="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
            voice_style="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
        ),
        debug=False,
        num_threads=2,
        provider="cpu",
    ),
)

tts = sherpa_onnx.OfflineTts(tts_config)

gen_config = sherpa_onnx.GenerationConfig()
gen_config.sid = 0
gen_config.num_steps = 8
gen_config.speed = 1.0
gen_config.extra["lang"] = "tr"

audio = tts.generate("Bu, yeni nesil kaldi'yi kullanan bir metinden konuşmaya motorudur", gen_config)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
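The configuration above refers to seven files inside the extracted model directory. Before creating the TTS object, it can help to confirm that they are all present. The sketch below uses only the Python standard library; the file names are copied from the example above, and `missing_files` is a hypothetical helper, not a sherpa-onnx function.

```python
from pathlib import Path

MODEL_DIR = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11"

# File names taken from the configuration in the example above.
EXPECTED_FILES = [
    "duration_predictor.int8.onnx",
    "text_encoder.int8.onnx",
    "vector_estimator.int8.onnx",
    "vocoder.int8.onnx",
    "tts.json",
    "unicode_indexer.bin",
    "voice.bin",
]


def missing_files(model_dir=MODEL_DIR):
    """Return the expected file names that are not found in model_dir."""
    root = Path(model_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_files()
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All model files found")
```

Run it from the directory where you extracted the archive, since the paths in the example are relative.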

C API

You can use the following code to play with supertonic-3 with C API.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));
  config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
  config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
  config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
  config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
  config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
  config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
  config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";

  config.model.num_threads = 2;

  // If you don't want to see debug messages, please set it to 0
  config.model.debug = 1;

  const char *text = "Bu, yeni nesil kaldi'yi kullanan bir metinden konuşmaya motorudur";

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.num_steps = 8;
  gen_cfg.speed = 1.0;
  gen_cfg.extra = "{\"lang\": \"tr\"}";

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg,
                                             ProgressCallback, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
 -DSHERPA_ONNX_ENABLE_C_API=ON \
 -DCMAKE_BUILD_TYPE=Release \
 -DBUILD_SHARED_LIBS=ON \
 -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
 ..

make
make install

You can find the required header and library files in /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-supertonic.c. Then you can compile it with the following command:

gcc \
  -I /tmp/sherpa-onnx/shared/include \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-c-api \
  -lonnxruntime \
  -o /tmp/test-supertonic \
  /tmp/test-supertonic.c

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-supertonic.
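If you launch the compiled program from a script rather than from an interactive shell, the same variable can be set for the child process only. The following is a minimal Python sketch using the standard library; `library_path_var` and `env_with_library_dir` are hypothetical helper names, and the platform check assumes that only Linux and macOS matter here.

```python
import os
import sys


def library_path_var(platform_name=sys.platform):
    """Name of the shared-library search variable: macOS vs. everything else."""
    return "DYLD_LIBRARY_PATH" if platform_name == "darwin" else "LD_LIBRARY_PATH"


def env_with_library_dir(lib_dir, base_env=None, platform_name=sys.platform):
    """Return a copy of the environment with lib_dir prepended to the search path."""
    env = dict(os.environ if base_env is None else base_env)
    var = library_path_var(platform_name)
    old = env.get(var, "")
    env[var] = lib_dir + (os.pathsep + old if old else "")
    return env


print(library_path_var("darwin"))  # DYLD_LIBRARY_PATH
print(library_path_var("linux"))   # LD_LIBRARY_PATH
```

The returned dictionary can be passed as the `env` argument of `subprocess.run`, for example `subprocess.run(["/tmp/test-supertonic"], cwd="/tmp", env=env_with_library_dir("/tmp/sherpa-onnx/shared/lib"))`.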

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

C++ API

You can use the following code to play with supertonic-3 with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;

  config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
  config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
  config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
  config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
  config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
  config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
  config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";

  config.model.num_threads = 2;

  // If you don't want to see debug messages, please set it to 0
  config.model.debug = 1;

  std::string filename = "./test.wav";
  std::string text = "Bu, yeni nesil kaldi'yi kullanan bir metinden konuşmaya motorudur";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.num_steps = 8;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed
  gen_cfg.extra = R"({"lang": "tr"})";

  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
 -DSHERPA_ONNX_ENABLE_C_API=ON \
 -DCMAKE_BUILD_TYPE=Release \
 -DBUILD_SHARED_LIBS=ON \
 -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
 ..

make
make install

You can find the required header and library files in /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-supertonic.cc. Then you can compile it with the following command:

g++ \
  -std=c++17 \
  -I /tmp/sherpa-onnx/shared/include \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-cxx-api \
  -lsherpa-onnx-c-api \
  -lonnxruntime \
  -o /tmp/test-supertonic \
  /tmp/test-supertonic.cc

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-supertonic.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with supertonic-3 with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsSupertonicModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            supertonic: OfflineTtsSupertonicModelConfig {
                duration_predictor: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx".into()),
                text_encoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx".into()),
                vector_estimator: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx".into()),
                vocoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx".into()),
                tts_json: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json".into()),
                unicode_indexer: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin".into()),
                voice_style: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Bu, yeni nesil kaldi'yi kullanan bir metinden konuşmaya motorudur";

    let gen_config = GenerationConfig {
        sid: 0,
        num_steps: 8,
        speed: 1.0,
        extra: Some(r#"{"lang": "tr"}"#.into()),
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with supertonic-3 with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      supertonic: {
        durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
        textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
        vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
        vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
        ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
        unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
        voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
      },
      debug: true,
      numThreads: 2,
      provider: 'cpu',
    },
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

// Double quotes are used because the text contains an apostrophe.
const text = "Bu, yeni nesil kaldi'yi kullanan bir metinden konuşmaya motorudur";

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  numSteps: 8,
  speed: 1.0,
  extra: {lang: 'tr'},
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.
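The example above reports a real-time factor (RTF): the time spent generating divided by the duration of the generated audio. A value below 1 means synthesis runs faster than playback. The following is a small Python sketch of the same arithmetic; the function name `real_time_factor` is illustrative and the numbers are made up for the demonstration.

```python
def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """Elapsed generation time divided by the duration of the generated audio."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # audio length in seconds
    return elapsed_seconds / duration


# 48000 samples at 24000 Hz is 2 seconds of audio, generated in 0.5 seconds.
print(real_time_factor(0.5, 48000, 24000))  # 0.25
```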

Dart API

You can use the following code to play with supertonic-3 with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final supertonic = sherpa_onnx.OfflineTtsSupertonicModelConfig(
    durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
    textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
    vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
    vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
    ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
    unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
    voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    supertonic: supertonic,
    numThreads: 2,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    numSteps: 8,
    speed: 1.0,
    extra: {'lang': 'tr'},
  );
  // Double quotes are used because the text contains an apostrophe.
  final audio = tts.generateWithConfig(text: "Bu, yeni nesil kaldi'yi kullanan bir metinden konuşmaya motorudur", config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with supertonic-3 with Swift API.

func run() {
  let supertonic = sherpaOnnxOfflineTtsSupertonicModelConfig(
    durationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
    textEncoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
    vectorEstimator: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
    vocoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
    ttsJson: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
    unicodeIndexer: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
    voiceStyle: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(supertonic: supertonic)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Bu, yeni nesil kaldi'yi kullanan bir metinden konuşmaya motorudur"
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.numSteps = 8
  genConfig.speed = 1.0
  genConfig.extra = ["lang": "tr"]

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with supertonic-3 with C# API.

using System;
using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Supertonic.DurationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.Model.Supertonic.TextEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.Model.Supertonic.VectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.Model.Supertonic.Vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.Model.Supertonic.TtsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.Model.Supertonic.UnicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.Model.Supertonic.VoiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";

config.Model.NumThreads = 2;
config.Model.Debug = 1;
config.Model.Provider = "cpu";

var tts = new OfflineTts(config);
var text = "Bu, yeni nesil kaldi'yi kullanan bir metinden konuşmaya motorudur";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.NumSteps = 8;
genConfig.Speed = 1.0f;
genConfig.Extra = "{\"lang\": \"tr\"}";

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with supertonic-3 with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      supertonic = OfflineTtsSupertonicModelConfig(
        durationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
        textEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
        vectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
        vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
        ttsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
        unicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
        voiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
      ),
      numThreads = 2,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    numSteps = 8,
    speed = 1.0f,
    extra = mapOf("lang" to "tr"),
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Bu, yeni nesil kaldi'yi kullanan bir metinden konuşmaya motorudur",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with supertonic-3 with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var supertonic = new OfflineTtsSupertonicModelConfig();
    supertonic.setDurationPredictor("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx");
    supertonic.setTextEncoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx");
    supertonic.setVectorEstimator("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx");
    supertonic.setVocoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx");
    supertonic.setTtsJson("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json");
    supertonic.setUnicodeIndexer("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin");
    supertonic.setVoiceStyle("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setSupertonic(supertonic);
    modelConfig.setNumThreads(2);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);

    var tts = new OfflineTts(config);
    var text = "Bu, yeni nesil kaldi'yi kullanan bir metinden konuşmaya motorudur";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setNumSteps(8);
    genConfig.setSpeed(1.0f);
    genConfig.setExtra("{\"lang\": \"tr\"}");

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with supertonic-3 with Pascal API.

program test_supertonic;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Supertonic.DurationPredictor := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx';
  Config.Model.Supertonic.TextEncoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx';
  Config.Model.Supertonic.VectorEstimator := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx';
  Config.Model.Supertonic.Vocoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx';
  Config.Model.Supertonic.TtsJson := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json';
  Config.Model.Supertonic.UnicodeIndexer := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin';
  Config.Model.Supertonic.VoiceStyle := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin';

  Config.Model.NumThreads := 2;
  Config.Model.Debug := True;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.NumSteps := 8;
  GenConfig.Speed := 1.0;
  GenConfig.Extra := '{"lang": "tr"}';

  // The apostrophe inside the string literal is escaped by doubling it.
  Audio := Tts.GenerateWithConfig('Bu, yeni nesil kaldi''yi kullanan bir metinden konuşmaya motorudur', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with supertonic-3 with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Supertonic: sherpa.OfflineTtsSupertonicModelConfig{
				DurationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
				TextEncoder:       "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
				VectorEstimator:   "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
				Vocoder:           "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
				TtsJson:           "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
				UnicodeIndexer:    "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
				VoiceStyle:        "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
			},
			NumThreads: 2,
			Debug:      true,
		},
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Bu, yeni nesil kaldi'yi kullanan bir metinden konuşmaya motorudur"

	genConfig := sherpa.GenerationConfig{
		Sid:       0,
		NumSteps:  8,
		Speed:     1.0,
		Extra:     `{"lang": "tr"}`,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

Sample audios for different speakers are listed below:

Speaker 0

0

Merhaba dünya.

1

Bugün nasılsın?

2

Gökyüzü mavi ve rüzgar hafif.

3

Makine öğrenimi bilgisayarların verilerden öğrenmesine yardımcı olur.

4

Konuşma sentezi metni anlaşılır sese dönüştürür.

5

Öğrenciler kütüphanede kısa bir hikaye okudu.

6

Tren ray bakımı nedeniyle gecikti.

7

Küçük modeller yerel cihazlarda hızlı çalışır.

8

Sesli asistan günlük işlerde yardımcı olur.

9

Kararlı okuma kısa ve uzun cümleler için önemlidir.

Speaker 1

0

Merhaba dünya.

1

Bugün nasılsın?

2

Gökyüzü mavi ve rüzgar hafif.

3

Makine öğrenimi bilgisayarların verilerden öğrenmesine yardımcı olur.

4

Konuşma sentezi metni anlaşılır sese dönüştürür.

5

Öğrenciler kütüphanede kısa bir hikaye okudu.

6

Tren ray bakımı nedeniyle gecikti.

7

Küçük modeller yerel cihazlarda hızlı çalışır.

8

Sesli asistan günlük işlerde yardımcı olur.

9

Kararlı okuma kısa ve uzun cümleler için önemlidir.

Speaker 2

0

Merhaba dünya.

1

Bugün nasılsın?

2

Gökyüzü mavi ve rüzgar hafif.

3

Makine öğrenimi bilgisayarların verilerden öğrenmesine yardımcı olur.

4

Konuşma sentezi metni anlaşılır sese dönüştürür.

5

Öğrenciler kütüphanede kısa bir hikaye okudu.

6

Tren ray bakımı nedeniyle gecikti.

7

Küçük modeller yerel cihazlarda hızlı çalışır.

8

Sesli asistan günlük işlerde yardımcı olur.

9

Kararlı okuma kısa ve uzun cümleler için önemlidir.

Speaker 3

0

Merhaba dünya.

1

Bugün nasılsın?

2

Gökyüzü mavi ve rüzgar hafif.

3

Makine öğrenimi bilgisayarların verilerden öğrenmesine yardımcı olur.

4

Konuşma sentezi metni anlaşılır sese dönüştürür.

5

Öğrenciler kütüphanede kısa bir hikaye okudu.

6

Tren ray bakımı nedeniyle gecikti.

7

Küçük modeller yerel cihazlarda hızlı çalışır.

8

Sesli asistan günlük işlerde yardımcı olur.

9

Kararlı okuma kısa ve uzun cümleler için önemlidir.

Speaker 4

0

Merhaba dünya.

1

Bugün nasılsın?

2

Gökyüzü mavi ve rüzgar hafif.

3

Makine öğrenimi bilgisayarların verilerden öğrenmesine yardımcı olur.

4

Konuşma sentezi metni anlaşılır sese dönüştürür.

5

Öğrenciler kütüphanede kısa bir hikaye okudu.

6

Tren ray bakımı nedeniyle gecikti.

7

Küçük modeller yerel cihazlarda hızlı çalışır.

8

Sesli asistan günlük işlerde yardımcı olur.

9

Kararlı okuma kısa ve uzun cümleler için önemlidir.

Speaker 5

0

Merhaba dünya.

1

Bugün nasılsın?

2

Gökyüzü mavi ve rüzgar hafif.

3

Makine öğrenimi bilgisayarların verilerden öğrenmesine yardımcı olur.

4

Konuşma sentezi metni anlaşılır sese dönüştürür.

5

Öğrenciler kütüphanede kısa bir hikaye okudu.

6

Tren ray bakımı nedeniyle gecikti.

7

Küçük modeller yerel cihazlarda hızlı çalışır.

8

Sesli asistan günlük işlerde yardımcı olur.

9

Kararlı okuma kısa ve uzun cümleler için önemlidir.

Speaker 6

0

Merhaba dünya.

1

Bugün nasılsın?

2

Gökyüzü mavi ve rüzgar hafif.

3

Makine öğrenimi bilgisayarların verilerden öğrenmesine yardımcı olur.

4

Konuşma sentezi metni anlaşılır sese dönüştürür.

5

Öğrenciler kütüphanede kısa bir hikaye okudu.

6

Tren ray bakımı nedeniyle gecikti.

7

Küçük modeller yerel cihazlarda hızlı çalışır.

8

Sesli asistan günlük işlerde yardımcı olur.

9

Kararlı okuma kısa ve uzun cümleler için önemlidir.

Speaker 7

0

Merhaba dünya.

1

Bugün nasılsın?

2

Gökyüzü mavi ve rüzgar hafif.

3

Makine öğrenimi bilgisayarların verilerden öğrenmesine yardımcı olur.

4

Konuşma sentezi metni anlaşılır sese dönüştürür.

5

Öğrenciler kütüphanede kısa bir hikaye okudu.

6

Tren ray bakımı nedeniyle gecikti.

7

Küçük modeller yerel cihazlarda hızlı çalışır.

8

Sesli asistan günlük işlerde yardımcı olur.

9

Kararlı okuma kısa ve uzun cümleler için önemlidir.

Speaker 8

0

Merhaba dünya.

1

Bugün nasılsın?

2

Gökyüzü mavi ve rüzgar hafif.

3

Makine öğrenimi bilgisayarların verilerden öğrenmesine yardımcı olur.

4

Konuşma sentezi metni anlaşılır sese dönüştürür.

5

Öğrenciler kütüphanede kısa bir hikaye okudu.

6

Tren ray bakımı nedeniyle gecikti.

7

Küçük modeller yerel cihazlarda hızlı çalışır.

8

Sesli asistan günlük işlerde yardımcı olur.

9

Kararlı okuma kısa ve uzun cümleler için önemlidir.

Speaker 9

0

Merhaba dünya.

1

Bugün nasılsın?

2

Gökyüzü mavi ve rüzgar hafif.

3

Makine öğrenimi bilgisayarların verilerden öğrenmesine yardımcı olur.

4

Konuşma sentezi metni anlaşılır sese dönüştürür.

5

Öğrenciler kütüphanede kısa bir hikaye okudu.

6

Tren ray bakımı nedeniyle gecikti.

7

Küçük modeller yerel cihazlarda hızlı çalışır.

8

Sesli asistan günlük işlerde yardımcı olur.

9

Kararlı okuma kısa ve uzun cümleler için önemlidir.

Ukrainian

This section lists text to speech models for Ukrainian.

vits-piper-uk_UA-lada-x_low

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/uk/uk_UA/lada/x_low

| Number of speakers | Sample rate |
|---|---|
| 1 | 16000 |

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-uk_UA-lada-x_low.tar.bz2
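The API examples for the piper models on this page assume that the archive unpacks into a directory named after the model, containing `<voice>.onnx`, `tokens.txt`, and `espeak-ng-data`. The Python sketch below derives those paths and the download URL from the model name. The layout is inferred from the examples on this page, so treat it as an assumption for other models; `piper_model_paths` is a hypothetical helper.

```python
BASE_URL = "https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models"


def piper_model_paths(model_name):
    """Derive the download URL and file paths for a vits-piper-* model name."""
    prefix = "vits-piper-"
    if not model_name.startswith(prefix):
        raise ValueError(f"not a vits-piper model name: {model_name!r}")
    voice = model_name[len(prefix):]  # e.g. uk_UA-lada-x_low
    return {
        "url": f"{BASE_URL}/{model_name}.tar.bz2",
        "model": f"{model_name}/{voice}.onnx",
        "tokens": f"{model_name}/tokens.txt",
        "data_dir": f"{model_name}/espeak-ng-data",
    }


for key, value in piper_model_paths("vits-piper-uk_UA-lada-x_low").items():
    print(f"{key}: {value}")
```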

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

| ABI | URL | China mirror |
|---|---|---|
| arm64-v8a | Download | Download |
| armeabi-v7a | Download | Download |
| x86_64 | Download | Download |
| x86 | Download | Download |

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to try vits-piper-uk_UA-lada-x_low with the C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-uk_UA-lada-x_low/uk_UA-lada-x_low.onnx";
  config.model.vits.tokens = "vits-piper-uk_UA-lada-x_low/tokens.txt";
  config.model.vits.data_dir = "vits-piper-uk_UA-lada-x_low/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

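  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
  if (!tts) {
    // Creation fails, for example, when a model path is wrong
    fprintf(stderr, "Failed to create the TTS engine. Please check your config\n");
    return 1;
  }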

  const char *text = "Ви не можете навчити коня, якщо не відвикнете від годівлі.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc   -I /tmp/sherpa-onnx/shared/include   -o /tmp/test-piper   /tmp/test-piper.c   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-uk_UA-lada-x_low.tar.bz2

You can use the following code to try vits-piper-uk_UA-lada-x_low with the Python API.

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-uk_UA-lada-x_low/uk_UA-lada-x_low.onnx",
            data_dir="vits-piper-uk_UA-lada-x_low/espeak-ng-data",
            tokens="vits-piper-uk_UA-lada-x_low/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Ви не можете навчити коня, якщо не відвикнете від годівлі.",
                     sid=0,
                     speed=1.0)

sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
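If `soundfile` is not available, the generated samples can also be written with Python's standard `wave` module. The following is a minimal sketch, not part of sherpa-onnx: `write_wav_int16` is a hypothetical helper, and it assumes the samples are mono floats in the range [-1, 1], clipping anything outside that range before converting to 16-bit PCM.

```python
import struct
import wave


def write_wav_int16(path, samples, sample_rate):
    """Write mono float samples (assumed to lie in [-1, 1]) as 16-bit PCM WAV."""
    pcm = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))  # clip out-of-range values
        pcm += struct.pack("<h", int(round(s * 32767)))  # little-endian int16
    with wave.open(path, "wb") as f:
        f.setnchannels(1)  # mono
        f.setsampwidth(2)  # 2 bytes per sample, i.e. 16-bit
        f.setframerate(int(sample_rate))
        f.writeframes(bytes(pcm))
```

With the `audio` object from the example above, the call would be `write_wav_int16("test.wav", audio.samples, audio.sample_rate)`.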

C++ API

You can use the following code to try vits-piper-uk_UA-lada-x_low with the C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-uk_UA-lada-x_low/uk_UA-lada-x_low.onnx";
  config.model.vits.tokens = "vits-piper-uk_UA-lada-x_low/tokens.txt";
  config.model.vits.data_dir = "vits-piper-uk_UA-lada-x_low/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Ви не можете навчити коня, якщо не відвикнете від годівлі.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++   -std=c++17   -I /tmp/sherpa-onnx/shared/include   -o /tmp/test-piper   /tmp/test-piper.cc   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-cxx-api   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to try vits-piper-uk_UA-lada-x_low with the Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-uk_UA-lada-x_low/uk_UA-lada-x_low.onnx".into()),
                tokens: Some("vits-piper-uk_UA-lada-x_low/tokens.txt".into()),
                data_dir: Some("vits-piper-uk_UA-lada-x_low/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Ви не можете навчити коня, якщо не відвикнете від годівлі.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to try vits-piper-uk_UA-lada-x_low with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-uk_UA-lada-x_low/uk_UA-lada-x_low.onnx',
        tokens: 'vits-piper-uk_UA-lada-x_low/tokens.txt',
        dataDir: 'vits-piper-uk_UA-lada-x_low/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Ви не можете навчити коня, якщо не відвикнете від годівлі.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.
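The real-time factor (RTF) printed by the example above is the synthesis time divided by the duration of the generated audio, so a value below 1 means generation runs faster than playback. The same arithmetic can be sketched in Python. The function name `real_time_factor` is illustrative only and is not part of sherpa-onnx.

```python
def real_time_factor(elapsed_seconds: float, num_samples: int, sample_rate: int) -> float:
    """Return elapsed synthesis time divided by the audio duration in seconds."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # audio length in seconds
    return elapsed_seconds / duration
```

For instance, producing 32000 samples at 16000 Hz (2 seconds of audio) in 0.5 seconds gives an RTF of 0.25.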

Dart API

You can use the following code to try vits-piper-uk_UA-lada-x_low with the Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-uk_UA-lada-x_low/uk_UA-lada-x_low.onnx',
    tokens: 'vits-piper-uk_UA-lada-x_low/tokens.txt',
    dataDir: 'vits-piper-uk_UA-lada-x_low/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Ви не можете навчити коня, якщо не відвикнете від годівлі.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to try vits-piper-uk_UA-lada-x_low with the Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-uk_UA-lada-x_low/uk_UA-lada-x_low.onnx",
    lexicon: "",
    tokens: "vits-piper-uk_UA-lada-x_low/tokens.txt",
    dataDir: "vits-piper-uk_UA-lada-x_low/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Ви не можете навчити коня, якщо не відвикнете від годівлі."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to try vits-piper-uk_UA-lada-x_low with the C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-uk_UA-lada-x_low/uk_UA-lada-x_low.onnx";
config.Model.Vits.Tokens = "vits-piper-uk_UA-lada-x_low/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-uk_UA-lada-x_low/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Ви не можете навчити коня, якщо не відвикнете від годівлі.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to try vits-piper-uk_UA-lada-x_low with the Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-uk_UA-lada-x_low/uk_UA-lada-x_low.onnx",
        tokens = "vits-piper-uk_UA-lada-x_low/tokens.txt",
        dataDir = "vits-piper-uk_UA-lada-x_low/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Ви не можете навчити коня, якщо не відвикнете від годівлі.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to try vits-piper-uk_UA-lada-x_low with the Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-uk_UA-lada-x_low/uk_UA-lada-x_low.onnx");
    vits.setTokens("vits-piper-uk_UA-lada-x_low/tokens.txt");
    vits.setDataDir("vits-piper-uk_UA-lada-x_low/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Ви не можете навчити коня, якщо не відвикнете від годівлі.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to try vits-piper-uk_UA-lada-x_low with the Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-uk_UA-lada-x_low/uk_UA-lada-x_low.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-uk_UA-lada-x_low/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-uk_UA-lada-x_low/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Ви не можете навчити коня, якщо не відвикнете від годівлі.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to try vits-piper-uk_UA-lada-x_low with the Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-uk_UA-lada-x_low/uk_UA-lada-x_low.onnx",
				Tokens:  "vits-piper-uk_UA-lada-x_low/tokens.txt",
				DataDir: "vits-piper-uk_UA-lada-x_low/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Ви не можете навчити коня, якщо не відвикнете від годівлі."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Ви не можете навчити коня, якщо не відвикнете від годівлі.

sample audio for each speaker is listed below:

Speaker 0

vits-piper-uk_UA-ukrainian_tts-medium

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/uk/uk_UA/ukrainian_tts/medium

Number of speakers | Sample rate
3 | 22050

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-uk_UA-ukrainian_tts-medium.tar.bz2

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI | URL | China mirror
arm64-v8a | Download | Download
armeabi-v7a | Download | Download
x86_64 | Download | Download
x86 | Download | Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to try vits-piper-uk_UA-ukrainian_tts-medium with the C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-uk_UA-ukrainian_tts-medium/uk_UA-ukrainian_tts-medium.onnx";
  config.model.vits.tokens = "vits-piper-uk_UA-ukrainian_tts-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-uk_UA-ukrainian_tts-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

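  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
  if (!tts) {
    // Creation fails, for example, when a model path is wrong
    fprintf(stderr, "Failed to create the TTS engine. Please check your config\n");
    return 1;
  }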

  const char *text = "Ви не можете навчити коня, якщо не відвикнете від годівлі.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc   -I /tmp/sherpa-onnx/shared/include   -o /tmp/test-piper   /tmp/test-piper.c   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-uk_UA-ukrainian_tts-medium.tar.bz2

You can use the following code to try vits-piper-uk_UA-ukrainian_tts-medium with the Python API.

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-uk_UA-ukrainian_tts-medium/uk_UA-ukrainian_tts-medium.onnx",
            data_dir="vits-piper-uk_UA-ukrainian_tts-medium/espeak-ng-data",
            tokens="vits-piper-uk_UA-ukrainian_tts-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Ви не можете навчити коня, якщо не відвикнете від годівлі.",
                     sid=0,
                     speed=1.0)

sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
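Because this model has 3 speakers, valid speaker IDs are 0, 1, and 2, and the example above only exercises `sid=0`. The following sketch generates one result per speaker by calling `generate` with the same keyword arguments used above. `synthesize_all_speakers` is a hypothetical helper, not part of sherpa-onnx, and the number of speakers is passed in explicitly (taken from the table for this model) rather than queried from the engine.

```python
def synthesize_all_speakers(tts, text, num_speakers, speed=1.0):
    """Call tts.generate once per speaker ID and return {sid: audio}.

    `tts` is any object exposing generate(text=..., sid=..., speed=...),
    such as the OfflineTts instance created in the example above.
    """
    results = {}
    for sid in range(num_speakers):  # speaker IDs run from 0 to num_speakers - 1
        results[sid] = tts.generate(text=text, sid=sid, speed=speed)
    return results
```

With the `tts` object from the example above, `synthesize_all_speakers(tts, text, num_speakers=3)` returns three audio objects, each of which can then be saved to its own file, for example `test-0.wav`, `test-1.wav`, and `test-2.wav`.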

C++ API

You can use the following code to try vits-piper-uk_UA-ukrainian_tts-medium with the C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-uk_UA-ukrainian_tts-medium/uk_UA-ukrainian_tts-medium.onnx";
  config.model.vits.tokens = "vits-piper-uk_UA-ukrainian_tts-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-uk_UA-ukrainian_tts-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Ви не можете навчити коня, якщо не відвикнете від годівлі.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++   -std=c++17   -I /tmp/sherpa-onnx/shared/include   -o /tmp/test-piper   /tmp/test-piper.cc   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-cxx-api   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to try vits-piper-uk_UA-ukrainian_tts-medium with the Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-uk_UA-ukrainian_tts-medium/uk_UA-ukrainian_tts-medium.onnx".into()),
                tokens: Some("vits-piper-uk_UA-ukrainian_tts-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-uk_UA-ukrainian_tts-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Ви не можете навчити коня, якщо не відвикнете від годівлі.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to try vits-piper-uk_UA-ukrainian_tts-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-uk_UA-ukrainian_tts-medium/uk_UA-ukrainian_tts-medium.onnx',
        tokens: 'vits-piper-uk_UA-ukrainian_tts-medium/tokens.txt',
        dataDir: 'vits-piper-uk_UA-ukrainian_tts-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Ви не можете навчити коня, якщо не відвикнете від годівлі.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to try vits-piper-uk_UA-ukrainian_tts-medium with the Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-uk_UA-ukrainian_tts-medium/uk_UA-ukrainian_tts-medium.onnx',
    tokens: 'vits-piper-uk_UA-ukrainian_tts-medium/tokens.txt',
    dataDir: 'vits-piper-uk_UA-ukrainian_tts-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Ви не можете навчити коня, якщо не відвикнете від годівлі.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to try vits-piper-uk_UA-ukrainian_tts-medium with the Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-uk_UA-ukrainian_tts-medium/uk_UA-ukrainian_tts-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-uk_UA-ukrainian_tts-medium/tokens.txt",
    dataDir: "vits-piper-uk_UA-ukrainian_tts-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Ви не можете навчити коня, якщо не відвикнете від годівлі."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to try vits-piper-uk_UA-ukrainian_tts-medium with the C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-uk_UA-ukrainian_tts-medium/uk_UA-ukrainian_tts-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-uk_UA-ukrainian_tts-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-uk_UA-ukrainian_tts-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Ви не можете навчити коня, якщо не відвикнете від годівлі.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to run vits-piper-uk_UA-ukrainian_tts-medium with the Kotlin API:

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-uk_UA-ukrainian_tts-medium/uk_UA-ukrainian_tts-medium.onnx",
        tokens = "vits-piper-uk_UA-ukrainian_tts-medium/tokens.txt",
        dataDir = "vits-piper-uk_UA-ukrainian_tts-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Ви не можете навчити коня, якщо не відвикнете від годівлі.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to run vits-piper-uk_UA-ukrainian_tts-medium with the Java API:

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-uk_UA-ukrainian_tts-medium/uk_UA-ukrainian_tts-medium.onnx");
    vits.setTokens("vits-piper-uk_UA-ukrainian_tts-medium/tokens.txt");
    vits.setDataDir("vits-piper-uk_UA-ukrainian_tts-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Ви не можете навчити коня, якщо не відвикнете від годівлі.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to run vits-piper-uk_UA-ukrainian_tts-medium with the Pascal API:

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-uk_UA-ukrainian_tts-medium/uk_UA-ukrainian_tts-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-uk_UA-ukrainian_tts-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-uk_UA-ukrainian_tts-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Ви не можете навчити коня, якщо не відвикнете від годівлі.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to run vits-piper-uk_UA-ukrainian_tts-medium with the Go API:

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-uk_UA-ukrainian_tts-medium/uk_UA-ukrainian_tts-medium.onnx",
				Tokens:  "vits-piper-uk_UA-ukrainian_tts-medium/tokens.txt",
				DataDir: "vits-piper-uk_UA-ukrainian_tts-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Ви не можете навчити коня, якщо не відвикнете від годівлі."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Ви не можете навчити коня, якщо не відвикнете від годівлі.

sample audios for different speakers are listed below:

Speaker 0

Speaker 1

Speaker 2
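
To produce one file per speaker for the text above, the generation call can be wrapped in a loop over the speaker IDs. The sketch below is a hypothetical helper, not part of sherpa-onnx: `synthesize` stands for any function you supply that turns a text, a speaker ID and an output filename into a wave file (for example a thin wrapper around one of the API examples on this page), and the `speaker-<sid>.wav` naming scheme is an assumption.

```python
from typing import Callable, Iterable, List


def synthesize_all_speakers(
    synthesize: Callable[[str, int, str], None],
    text: str,
    speaker_ids: Iterable[int],
    prefix: str = "speaker",
) -> List[str]:
    """Call synthesize once per speaker ID and return the filenames used."""
    filenames = []
    for sid in speaker_ids:
        filename = f"{prefix}-{sid}.wav"  # hypothetical naming scheme
        synthesize(text, sid, filename)
        filenames.append(filename)
    return filenames


if __name__ == "__main__":
    # Stand-in for a real TTS call: it only reports what it would generate.
    def fake_synthesize(text: str, sid: int, filename: str) -> None:
        print(f"sid={sid} -> {filename}")

    sample_text = "Ви не можете навчити коня, якщо не відвикнете від годівлі."
    synthesize_all_speakers(fake_synthesize, sample_text, range(3))
```

Passing the synthesis step in as a function keeps the loop independent of any particular binding, so the same helper can be tried with a stub before a model is downloaded.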

supertonic-3-uk

Info about this model

This model is supertonic 3 from https://huggingface.co/Supertone/supertonic-3

It supports 31 languages: en, ko, ja, ar, bg, cs, da, de, el, es, et, fi, fr, hi, hr, hu, id, it, lt, lv, nl, pl, pt, ro, ru, sk, sl, sv, tr, uk, vi.

This page shows samples for Ukrainian (uk).

Number of speakers | Sample rate
10 | 24000

Speaker IDs

sid | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
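
Because the model expects one of the 31 language codes listed above and a speaker ID between 0 and 9, it can help to validate both before calling the engine. The following is a minimal sketch that uses only the values given on this page; the names `SUPPORTED_LANGS`, `NUM_SPEAKERS` and `check_request` are hypothetical and not part of sherpa-onnx.

```python
# Language codes copied from the list on this page.
SUPPORTED_LANGS = (
    "en ko ja ar bg cs da de el es et fi fr hi hr hu id it lt lv "
    "nl pl pt ro ru sk sl sv tr uk vi"
).split()

NUM_SPEAKERS = 10  # speaker IDs 0..9, as listed above


def check_request(lang: str, sid: int) -> None:
    """Raise ValueError if lang or sid is outside what this page documents."""
    if lang not in SUPPORTED_LANGS:
        raise ValueError(f"unsupported language code: {lang!r}")
    if not 0 <= sid < NUM_SPEAKERS:
        raise ValueError(
            f"speaker ID must be in 0..{NUM_SPEAKERS - 1}, got {sid}"
        )


check_request("uk", 0)  # Ukrainian with the first speaker passes silently
```

A check like this could run just before the language code is placed into the generation config, so that a typo surfaces as a clear error on the caller's side.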

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-supertonic-3-tts-int8-2026-05-11.tar.bz2
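
If you prefer to fetch the archive from a script rather than from the browser, the download and extraction can be sketched with the Python standard library. This is only an illustration: the helper names are hypothetical, and it assumes that the archive unpacks into a directory named after the archive itself, which is how the code examples on this page reference the model files.

```python
import tarfile
import urllib.request
from pathlib import Path

MODEL_URL = (
    "https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/"
    "sherpa-onnx-supertonic-3-tts-int8-2026-05-11.tar.bz2"
)


def archive_dir_name(url: str) -> str:
    """Return the archive file name without its .tar.bz2 suffix."""
    name = url.rsplit("/", 1)[-1]
    suffix = ".tar.bz2"
    return name[: -len(suffix)] if name.endswith(suffix) else name


def download_and_extract(url: str = MODEL_URL, dest: Path = Path(".")) -> Path:
    """Download the archive into dest, unpack it there, and delete the archive."""
    archive = dest / url.rsplit("/", 1)[-1]
    urllib.request.urlretrieve(url, archive)
    with tarfile.open(archive, "r:bz2") as tar:
        tar.extractall(dest)
    archive.unlink()
    # Assumed layout: the files end up in a directory named after the archive.
    return dest / archive_dir_name(url)


# Usage (needs network access, so it is not executed here):
#   model_dir = download_and_extract()
#   print(model_dir)
```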

Android APK

The following table lists the Android TTS Engine APKs built with this model for sherpa-onnx v1.13.2.

ABI | URL | China mirror
arm64-v8a | Download | Download
armeabi-v7a | Download | Download
x86_64 | Download | Download
x86 | Download | Download

If you don’t know what ABI is, you probably need to select arm64-v8a.
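
If you can obtain the device's machine string (for instance the output of `uname -m`, which is an assumption about how you inspect the device), it can be mapped to one of the four ABIs above. The lookup table and the function name below are illustrative only; unknown values fall back to arm64-v8a, following the advice above.

```python
# Common `uname -m` machine strings and the Android ABI they correspond to.
MACHINE_TO_ABI = {
    "aarch64": "arm64-v8a",
    "arm64": "arm64-v8a",
    "armv7l": "armeabi-v7a",
    "armv8l": "armeabi-v7a",  # 32-bit userspace on a 64-bit ARM CPU
    "x86_64": "x86_64",
    "i686": "x86",
    "i386": "x86",
}


def abi_for_machine(machine: str, default: str = "arm64-v8a") -> str:
    """Map a machine string to an Android ABI name, with a fallback."""
    return MACHINE_TO_ABI.get(machine.strip().lower(), default)


print(abi_for_machine("aarch64"))
print(abi_for_machine("i686"))
```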

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

Python API

Assume you have installed sherpa-onnx and soundfile (used below to write the wave file) via

pip install sherpa-onnx soundfile

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-supertonic-3-tts-int8-2026-05-11.tar.bz2

and extracted it into the current directory. You can use the following code to run supertonic-3 with the Python API:

import sherpa_onnx
import soundfile as sf

tts_config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        supertonic=sherpa_onnx.OfflineTtsSupertonicModelConfig(
            duration_predictor="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
            text_encoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
            vector_estimator="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
            vocoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
            tts_json="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
            unicode_indexer="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
            voice_style="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
        ),
        debug=False,
        num_threads=2,
        provider="cpu",
    ),
)

tts = sherpa_onnx.OfflineTts(tts_config)

gen_config = sherpa_onnx.GenerationConfig()
gen_config.sid = 0
gen_config.num_steps = 8
gen_config.speed = 1.0
gen_config.extra["lang"] = "uk"

audio = tts.generate("Це механізм перетворення тексту на мовлення, який використовує kaldi нового покоління", gen_config)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
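
The Node.js example on this page also reports a real-time factor (RTF), the ratio of elapsed synthesis time to the duration of the generated audio. The same arithmetic can be sketched in Python; `real_time_factor` is a hypothetical helper name, and the commented usage assumes the `tts`, `gen_config` and `audio` objects from the example above.

```python
import time


def real_time_factor(elapsed_seconds: float, num_samples: int, sample_rate: int) -> float:
    """Return elapsed time divided by the duration of the generated audio."""
    duration = num_samples / sample_rate  # audio length in seconds
    return elapsed_seconds / duration


# Usage with the objects from the example above (not executed here):
#   start = time.time()
#   audio = tts.generate(text, gen_config)
#   elapsed = time.time() - start
#   rtf = real_time_factor(elapsed, len(audio.samples), audio.sample_rate)

# Worked example: 48000 samples at 24000 Hz are 2 s of audio;
# generating them in 0.5 s gives an RTF of 0.25.
print(real_time_factor(0.5, 48000, 24000))
```

An RTF below 1 means the audio is generated faster than it takes to play back.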

C API

You can use the following code to run supertonic-3 with the C API:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));
  config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
  config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
  config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
  config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
  config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
  config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
  config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";

  config.model.num_threads = 2;

  // If you don't want to see debug messages, please set it to 0
  config.model.debug = 1;

  const char *text = "Це механізм перетворення тексту на мовлення, який використовує kaldi нового покоління";

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.num_steps = 8;
  gen_cfg.speed = 1.0;
  gen_cfg.extra = "{\"lang\": \"uk\"}";

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg,
                                             ProgressCallback, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

To compile and run the example, first build sherpa-onnx as a shared library with the C API enabled:

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
 -DSHERPA_ONNX_ENABLE_C_API=ON \
 -DCMAKE_BUILD_TYPE=Release \
 -DBUILD_SHARED_LIBS=ON \
 -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
 ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-supertonic.c. Then you can compile it with the following command:

gcc \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-supertonic \
  /tmp/test-supertonic.c \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-supertonic.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html
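
The C example above passes the language through `gen_cfg.extra` as a JSON string, and several other bindings on this page (C++, Rust, C#, Java, Pascal, Go) do the same. If that string is produced by a script, building it with a JSON library avoids hand-written escaping. A minimal sketch follows; `make_extra` and `as_c_literal` are hypothetical helper names, and `lang` is the only key this page documents.

```python
import json


def make_extra(lang: str) -> str:
    """Serialize the extra generation options as a JSON object string."""
    return json.dumps({"lang": lang})


def as_c_literal(json_text: str) -> str:
    """Quote a JSON string so it can be pasted into C source code."""
    escaped = json_text.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


print(make_extra("uk"))                # {"lang": "uk"}
print(as_c_literal(make_extra("uk")))  # "{\"lang\": \"uk\"}"
```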

C++ API

You can use the following code to run supertonic-3 with the C++ API:

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;

  config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
  config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
  config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
  config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
  config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
  config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
  config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";

  config.model.num_threads = 2;

  // If you don't want to see debug messages, please set it to 0
  config.model.debug = 1;

  std::string filename = "./test.wav";
  std::string text = "Це механізм перетворення тексту на мовлення, який використовує kaldi нового покоління";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.num_steps = 8;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed
  gen_cfg.extra = R"({"lang": "uk"})";

  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

To compile and run the example, first build sherpa-onnx as a shared library with the C API enabled:

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
 -DSHERPA_ONNX_ENABLE_C_API=ON \
 -DCMAKE_BUILD_TYPE=Release \
 -DBUILD_SHARED_LIBS=ON \
 -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
 ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-supertonic.cc. Then you can compile it with the following command:

g++ \
  -std=c++17 \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-supertonic \
  /tmp/test-supertonic.cc \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-cxx-api \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-supertonic.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to run supertonic-3 with the Rust API:

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsSupertonicModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            supertonic: OfflineTtsSupertonicModelConfig {
                duration_predictor: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx".into()),
                text_encoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx".into()),
                vector_estimator: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx".into()),
                vocoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx".into()),
                tts_json: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json".into()),
                unicode_indexer: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin".into()),
                voice_style: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Це механізм перетворення тексту на мовлення, який використовує kaldi нового покоління";

    let gen_config = GenerationConfig {
        sid: 0,
        num_steps: 8,
        speed: 1.0,
        extra: Some(r#"{"lang": "uk"}"#.into()),
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to run supertonic-3 with the Node.js addon API:

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      supertonic: {
        durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
        textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
        vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
        vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
        ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
        unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
        voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
      },
      debug: true,
      numThreads: 2,
      provider: 'cpu',
    },
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Це механізм перетворення тексту на мовлення, який використовує kaldi нового покоління';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  numSteps: 8,
  speed: 1.0,
  extra: {lang: 'uk'},
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to run supertonic-3 with the Dart API:

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final supertonic = sherpa_onnx.OfflineTtsSupertonicModelConfig(
    durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
    textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
    vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
    vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
    ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
    unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
    voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    supertonic: supertonic,
    numThreads: 2,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    numSteps: 8,
    speed: 1.0,
    extra: {'lang': 'uk'},
  );
  final audio = tts.generateWithConfig(text: 'Це механізм перетворення тексту на мовлення, який використовує kaldi нового покоління', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to run supertonic-3 with the Swift API:

func run() {
  let supertonic = sherpaOnnxOfflineTtsSupertonicModelConfig(
    durationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
    textEncoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
    vectorEstimator: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
    vocoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
    ttsJson: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
    unicodeIndexer: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
    voiceStyle: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(supertonic: supertonic)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Це механізм перетворення тексту на мовлення, який використовує kaldi нового покоління"
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.numSteps = 8
  genConfig.speed = 1.0
  genConfig.extra = ["lang": "uk"]

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to run supertonic-3 with the C# API:

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Supertonic.DurationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.Model.Supertonic.TextEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.Model.Supertonic.VectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.Model.Supertonic.Vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.Model.Supertonic.TtsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.Model.Supertonic.UnicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.Model.Supertonic.VoiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";

config.Model.NumThreads = 2;
config.Model.Debug = 1;
config.Model.Provider = "cpu";

var tts = new OfflineTts(config);
var text = "Це механізм перетворення тексту на мовлення, який використовує kaldi нового покоління";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.NumSteps = 8;
genConfig.Speed = 1.0f;
genConfig.Extra = "{\"lang\": \"uk\"}";

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to run supertonic-3 with the Kotlin API:

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      supertonic = OfflineTtsSupertonicModelConfig(
        durationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
        textEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
        vectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
        vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
        ttsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
        unicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
        voiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
      ),
      numThreads = 2,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    numSteps = 8,
    speed = 1.0f,
    extra = mapOf("lang" to "uk"),
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Це механізм перетворення тексту на мовлення, який використовує kaldi нового покоління",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to run supertonic-3 with the Java API:

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var supertonic = new OfflineTtsSupertonicModelConfig();
    supertonic.setDurationPredictor("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx");
    supertonic.setTextEncoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx");
    supertonic.setVectorEstimator("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx");
    supertonic.setVocoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx");
    supertonic.setTtsJson("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json");
    supertonic.setUnicodeIndexer("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin");
    supertonic.setVoiceStyle("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setSupertonic(supertonic);
    modelConfig.setNumThreads(2);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);

    var tts = new OfflineTts(config);
    var text = "Це механізм перетворення тексту на мовлення, який використовує kaldi нового покоління";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setNumSteps(8);
    genConfig.setSpeed(1.0f);
    genConfig.setExtra("{\"lang\": \"uk\"}");

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to run supertonic-3 with the Pascal API:

program test_supertonic;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Supertonic.DurationPredictor := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx';
  Config.Model.Supertonic.TextEncoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx';
  Config.Model.Supertonic.VectorEstimator := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx';
  Config.Model.Supertonic.Vocoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx';
  Config.Model.Supertonic.TtsJson := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json';
  Config.Model.Supertonic.UnicodeIndexer := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin';
  Config.Model.Supertonic.VoiceStyle := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin';

  Config.Model.NumThreads := 2;
  Config.Model.Debug := True;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.NumSteps := 8;
  GenConfig.Speed := 1.0;
  GenConfig.Extra := '{"lang": "uk"}';

  Audio := Tts.GenerateWithConfig('Це механізм перетворення тексту на мовлення, який використовує kaldi нового покоління', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to run supertonic-3 with the Go API:

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Supertonic: sherpa.OfflineTtsSupertonicModelConfig{
				DurationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
				TextEncoder:       "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
				VectorEstimator:   "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
				Vocoder:           "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
				TtsJson:           "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
				UnicodeIndexer:    "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
				VoiceStyle:        "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
			},
			NumThreads: 2,
			Debug:      true,
		},
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Це механізм перетворення тексту на мовлення, який використовує kaldi нового покоління"

	genConfig := sherpa.GenerationConfig{
		Sid:       0,
		NumSteps:  8,
		Speed:     1.0,
		Extra:     `{"lang": "uk"}`,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

Sample audio for different speakers is listed below:

Speaker 0

0

Привіт світе.

1

Як ти сьогодні?

2

Небо блакитне, а вітер лагідний.

3

Машинне навчання допомагає комп’ютерам вчитися на даних.

4

Синтез мовлення перетворює текст на зрозумілий звук.

5

Учні прочитали коротку історію в бібліотеці.

6

Потяг затримався через ремонт колії.

7

Невеликі моделі швидко працюють на локальних пристроях.

8

Голосовий помічник допомагає з щоденними завданнями.

9

Стабільне читання важливе для коротких і довгих речень.

Speaker 1

0

Привіт світе.

1

Як ти сьогодні?

2

Небо блакитне, а вітер лагідний.

3

Машинне навчання допомагає комп’ютерам вчитися на даних.

4

Синтез мовлення перетворює текст на зрозумілий звук.

5

Учні прочитали коротку історію в бібліотеці.

6

Потяг затримався через ремонт колії.

7

Невеликі моделі швидко працюють на локальних пристроях.

8

Голосовий помічник допомагає з щоденними завданнями.

9

Стабільне читання важливе для коротких і довгих речень.

Speaker 2

0

Привіт світе.

1

Як ти сьогодні?

2

Небо блакитне, а вітер лагідний.

3

Машинне навчання допомагає комп’ютерам вчитися на даних.

4

Синтез мовлення перетворює текст на зрозумілий звук.

5

Учні прочитали коротку історію в бібліотеці.

6

Потяг затримався через ремонт колії.

7

Невеликі моделі швидко працюють на локальних пристроях.

8

Голосовий помічник допомагає з щоденними завданнями.

9

Стабільне читання важливе для коротких і довгих речень.

Speaker 3

0

Привіт світе.

1

Як ти сьогодні?

2

Небо блакитне, а вітер лагідний.

3

Машинне навчання допомагає комп’ютерам вчитися на даних.

4

Синтез мовлення перетворює текст на зрозумілий звук.

5

Учні прочитали коротку історію в бібліотеці.

6

Потяг затримався через ремонт колії.

7

Невеликі моделі швидко працюють на локальних пристроях.

8

Голосовий помічник допомагає з щоденними завданнями.

9

Стабільне читання важливе для коротких і довгих речень.

Speaker 4

0

Привіт світе.

1

Як ти сьогодні?

2

Небо блакитне, а вітер лагідний.

3

Машинне навчання допомагає комп’ютерам вчитися на даних.

4

Синтез мовлення перетворює текст на зрозумілий звук.

5

Учні прочитали коротку історію в бібліотеці.

6

Потяг затримався через ремонт колії.

7

Невеликі моделі швидко працюють на локальних пристроях.

8

Голосовий помічник допомагає з щоденними завданнями.

9

Стабільне читання важливе для коротких і довгих речень.

Speaker 5

0

Привіт світе.

1

Як ти сьогодні?

2

Небо блакитне, а вітер лагідний.

3

Машинне навчання допомагає комп’ютерам вчитися на даних.

4

Синтез мовлення перетворює текст на зрозумілий звук.

5

Учні прочитали коротку історію в бібліотеці.

6

Потяг затримався через ремонт колії.

7

Невеликі моделі швидко працюють на локальних пристроях.

8

Голосовий помічник допомагає з щоденними завданнями.

9

Стабільне читання важливе для коротких і довгих речень.

Speaker 6

0

Привіт світе.

1

Як ти сьогодні?

2

Небо блакитне, а вітер лагідний.

3

Машинне навчання допомагає комп’ютерам вчитися на даних.

4

Синтез мовлення перетворює текст на зрозумілий звук.

5

Учні прочитали коротку історію в бібліотеці.

6

Потяг затримався через ремонт колії.

7

Невеликі моделі швидко працюють на локальних пристроях.

8

Голосовий помічник допомагає з щоденними завданнями.

9

Стабільне читання важливе для коротких і довгих речень.

Speaker 7

0

Привіт світе.

1

Як ти сьогодні?

2

Небо блакитне, а вітер лагідний.

3

Машинне навчання допомагає комп’ютерам вчитися на даних.

4

Синтез мовлення перетворює текст на зрозумілий звук.

5

Учні прочитали коротку історію в бібліотеці.

6

Потяг затримався через ремонт колії.

7

Невеликі моделі швидко працюють на локальних пристроях.

8

Голосовий помічник допомагає з щоденними завданнями.

9

Стабільне читання важливе для коротких і довгих речень.

Speaker 8

0

Привіт світе.

1

Як ти сьогодні?

2

Небо блакитне, а вітер лагідний.

3

Машинне навчання допомагає комп’ютерам вчитися на даних.

4

Синтез мовлення перетворює текст на зрозумілий звук.

5

Учні прочитали коротку історію в бібліотеці.

6

Потяг затримався через ремонт колії.

7

Невеликі моделі швидко працюють на локальних пристроях.

8

Голосовий помічник допомагає з щоденними завданнями.

9

Стабільне читання важливе для коротких і довгих речень.

Speaker 9

0

Привіт світе.

1

Як ти сьогодні?

2

Небо блакитне, а вітер лагідний.

3

Машинне навчання допомагає комп’ютерам вчитися на даних.

4

Синтез мовлення перетворює текст на зрозумілий звук.

5

Учні прочитали коротку історію в бібліотеці.

6

Потяг затримався через ремонт колії.

7

Невеликі моделі швидко працюють на локальних пристроях.

8

Голосовий помічник допомагає з щоденними завданнями.

9

Стабільне читання важливе для коротких і довгих речень.

Urdu

This section lists text to speech models for Urdu.

vits-piper-ur_PK-fasih-medium

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/ur/ur_PK/fasih/medium

Number of speakers | Sample rate
1 | 22050

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-ur_PK-fasih-medium.tar.bz2
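
The download can also be scripted. The following is a minimal Python sketch using only the standard library. The helper names (archive_stem, extract_archive, download_and_extract) are hypothetical, and the sketch assumes the archive unpacks into a directory named after the archive, which matches the paths used by the examples in this section.

```python
import tarfile
import urllib.request
from pathlib import Path

MODEL_URL = (
    "https://github.com/k2-fsa/sherpa-onnx/releases/download/"
    "tts-models/vits-piper-ur_PK-fasih-medium.tar.bz2"
)


def archive_stem(url: str) -> str:
    """Return the archive file name without its .tar.bz2 suffix."""
    name = url.rsplit("/", 1)[-1]
    suffix = ".tar.bz2"
    return name[: -len(suffix)] if name.endswith(suffix) else name


def extract_archive(archive: Path, dest: Path) -> None:
    """Unpack a .tar.bz2 archive into the directory dest."""
    with tarfile.open(archive, "r:bz2") as tar:
        tar.extractall(dest)


def download_and_extract(url: str = MODEL_URL, dest: Path = Path(".")) -> Path:
    """Download the archive, unpack it, and return the assumed model directory."""
    archive = dest / url.rsplit("/", 1)[-1]
    urllib.request.urlretrieve(url, archive)  # network access happens here
    extract_archive(archive, dest)
    archive.unlink()  # remove the archive once it is unpacked
    # Assumption: the archive contains a top-level directory named like itself.
    return dest / archive_stem(url)
```

Calling download_and_extract() from the directory where you run the examples should leave a vits-piper-ur_PK-fasih-medium directory next to them.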

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI | URL | China mirror
arm64-v8a | Download | Download
armeabi-v7a | Download | Download
x86_64 | Download | Download
x86 | Download | Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-ur_PK-fasih-medium with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-ur_PK-fasih-medium/ur_PK-fasih-medium.onnx";
  config.model.vits.tokens = "vits-piper-ur_PK-fasih-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-ur_PK-fasih-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "قوس قزح، جسے قوس قزح یا رنگوں کی قوس قزح بھی کہا جاتا ہے، ایک قدرتی طبعی رجحان ہے جو بارش کے قطرے کے ذریعے سورج کی روشنی کے اضطراب اور پھیلاؤ کے نتیجے میں ہوتا ہے۔";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc   -I /tmp/sherpa-onnx/shared/include   -o /tmp/test-piper   /tmp/test-piper.c   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.
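
If you launch the binary from a script rather than an interactive shell, the same variable can be set for the child process only. The following is a minimal Python sketch; env_with_library_path is a hypothetical helper, and the library directory is the install prefix used in the build steps above.

```python
import os
import sys


def env_with_library_path(lib_dir: str, platform: str = sys.platform) -> dict:
    """Return a copy of os.environ with lib_dir prepended to the loader path."""
    # macOS uses DYLD_LIBRARY_PATH; this sketch treats everything else as Linux.
    var = "DYLD_LIBRARY_PATH" if platform == "darwin" else "LD_LIBRARY_PATH"
    env = dict(os.environ)
    current = env.get(var, "")
    env[var] = f"{lib_dir}:{current}" if current else lib_dir
    return env


env = env_with_library_path("/tmp/sherpa-onnx/shared/lib")

# To launch the compiled example with this environment (not run here):
#   import subprocess
#   subprocess.run(["/tmp/test-piper"], env=env, cwd="/tmp", check=True)
```

This leaves the environment of the calling shell untouched, which can be convenient when several builds of the library are installed side by side.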

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx and soundfile (used below to write the audio file) via

pip install sherpa-onnx soundfile

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-ur_PK-fasih-medium.tar.bz2

You can use the following code to play with vits-piper-ur_PK-fasih-medium

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-ur_PK-fasih-medium/ur_PK-fasih-medium.onnx",
            data_dir="vits-piper-ur_PK-fasih-medium/espeak-ng-data",
            tokens="vits-piper-ur_PK-fasih-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="قوس قزح، جسے قوس قزح یا رنگوں کی قوس قزح بھی کہا جاتا ہے، ایک قدرتی طبعی رجحان ہے جو بارش کے قطرے کے ذریعے سورج کی روشنی کے اضطراب اور پھیلاؤ کے نتیجے میں ہوتا ہے۔",
                     sid=0,
                     speed=1.0)

sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
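
To gauge synthesis speed, you can compare the elapsed time with the duration of the generated audio, as the Node.js example for this model does. The following is a minimal sketch of that arithmetic; real_time_factor and timed are hypothetical helpers, and the numbers in the example call are made up.

```python
import time


def real_time_factor(elapsed_seconds: float, num_samples: int, sample_rate: int) -> float:
    """Return elapsed time divided by audio duration (lower means faster)."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate
    return elapsed_seconds / duration


def timed(fn, *args, **kwargs):
    """Call fn and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


# Made-up numbers: 0.5 s spent synthesizing 2 s of audio at 22050 Hz.
print(f"RTF = {real_time_factor(0.5, 44100, 22050):.3f}")  # prints "RTF = 0.250"
```

With the example above, you could wrap the call as timed(tts.generate, text=..., sid=0, speed=1.0) and then pass the elapsed time, len(audio.samples), and audio.sample_rate to real_time_factor.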

C++ API

You can use the following code to play with vits-piper-ur_PK-fasih-medium with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-ur_PK-fasih-medium/ur_PK-fasih-medium.onnx";
  config.model.vits.tokens = "vits-piper-ur_PK-fasih-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-ur_PK-fasih-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "قوس قزح، جسے قوس قزح یا رنگوں کی قوس قزح بھی کہا جاتا ہے، ایک قدرتی طبعی رجحان ہے جو بارش کے قطرے کے ذریعے سورج کی روشنی کے اضطراب اور پھیلاؤ کے نتیجے میں ہوتا ہے۔";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++   -std=c++17   -I /tmp/sherpa-onnx/shared/include   -o /tmp/test-piper   /tmp/test-piper.cc   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-cxx-api   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with vits-piper-ur_PK-fasih-medium with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-ur_PK-fasih-medium/ur_PK-fasih-medium.onnx".into()),
                tokens: Some("vits-piper-ur_PK-fasih-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-ur_PK-fasih-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "قوس قزح، جسے قوس قزح یا رنگوں کی قوس قزح بھی کہا جاتا ہے، ایک قدرتی طبعی رجحان ہے جو بارش کے قطرے کے ذریعے سورج کی روشنی کے اضطراب اور پھیلاؤ کے نتیجے میں ہوتا ہے۔";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-ur_PK-fasih-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-ur_PK-fasih-medium/ur_PK-fasih-medium.onnx',
        tokens: 'vits-piper-ur_PK-fasih-medium/tokens.txt',
        dataDir: 'vits-piper-ur_PK-fasih-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'قوس قزح، جسے قوس قزح یا رنگوں کی قوس قزح بھی کہا جاتا ہے، ایک قدرتی طبعی رجحان ہے جو بارش کے قطرے کے ذریعے سورج کی روشنی کے اضطراب اور پھیلاؤ کے نتیجے میں ہوتا ہے۔';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to play with vits-piper-ur_PK-fasih-medium with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-ur_PK-fasih-medium/ur_PK-fasih-medium.onnx',
    tokens: 'vits-piper-ur_PK-fasih-medium/tokens.txt',
    dataDir: 'vits-piper-ur_PK-fasih-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'قوس قزح، جسے قوس قزح یا رنگوں کی قوس قزح بھی کہا جاتا ہے، ایک قدرتی طبعی رجحان ہے جو بارش کے قطرے کے ذریعے سورج کی روشنی کے اضطراب اور پھیلاؤ کے نتیجے میں ہوتا ہے۔', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-ur_PK-fasih-medium with Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-ur_PK-fasih-medium/ur_PK-fasih-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-ur_PK-fasih-medium/tokens.txt",
    dataDir: "vits-piper-ur_PK-fasih-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "قوس قزح، جسے قوس قزح یا رنگوں کی قوس قزح بھی کہا جاتا ہے، ایک قدرتی طبعی رجحان ہے جو بارش کے قطرے کے ذریعے سورج کی روشنی کے اضطراب اور پھیلاؤ کے نتیجے میں ہوتا ہے۔"
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-ur_PK-fasih-medium with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-ur_PK-fasih-medium/ur_PK-fasih-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-ur_PK-fasih-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-ur_PK-fasih-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "قوس قزح، جسے قوس قزح یا رنگوں کی قوس قزح بھی کہا جاتا ہے، ایک قدرتی طبعی رجحان ہے جو بارش کے قطرے کے ذریعے سورج کی روشنی کے اضطراب اور پھیلاؤ کے نتیجے میں ہوتا ہے۔";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-ur_PK-fasih-medium with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-ur_PK-fasih-medium/ur_PK-fasih-medium.onnx",
        tokens = "vits-piper-ur_PK-fasih-medium/tokens.txt",
        dataDir = "vits-piper-ur_PK-fasih-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "قوس قزح، جسے قوس قزح یا رنگوں کی قوس قزح بھی کہا جاتا ہے، ایک قدرتی طبعی رجحان ہے جو بارش کے قطرے کے ذریعے سورج کی روشنی کے اضطراب اور پھیلاؤ کے نتیجے میں ہوتا ہے۔",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with vits-piper-ur_PK-fasih-medium with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-ur_PK-fasih-medium/ur_PK-fasih-medium.onnx");
    vits.setTokens("vits-piper-ur_PK-fasih-medium/tokens.txt");
    vits.setDataDir("vits-piper-ur_PK-fasih-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "قوس قزح، جسے قوس قزح یا رنگوں کی قوس قزح بھی کہا جاتا ہے، ایک قدرتی طبعی رجحان ہے جو بارش کے قطرے کے ذریعے سورج کی روشنی کے اضطراب اور پھیلاؤ کے نتیجے میں ہوتا ہے۔";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-ur_PK-fasih-medium with Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-ur_PK-fasih-medium/ur_PK-fasih-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-ur_PK-fasih-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-ur_PK-fasih-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('قوس قزح، جسے قوس قزح یا رنگوں کی قوس قزح بھی کہا جاتا ہے، ایک قدرتی طبعی رجحان ہے جو بارش کے قطرے کے ذریعے سورج کی روشنی کے اضطراب اور پھیلاؤ کے نتیجے میں ہوتا ہے۔', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-ur_PK-fasih-medium with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-ur_PK-fasih-medium/ur_PK-fasih-medium.onnx",
				Tokens:  "vits-piper-ur_PK-fasih-medium/tokens.txt",
				DataDir: "vits-piper-ur_PK-fasih-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "قوس قزح، جسے قوس قزح یا رنگوں کی قوس قزح بھی کہا جاتا ہے، ایک قدرتی طبعی رجحان ہے جو بارش کے قطرے کے ذریعے سورج کی روشنی کے اضطراب اور پھیلاؤ کے نتیجے میں ہوتا ہے۔"

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

قوس قزح، جسے قوس قزح یا رنگوں کی قوس قزح بھی کہا جاتا ہے، ایک قدرتی طبعی رجحان ہے جو بارش کے قطرے کے ذریعے سورج کی روشنی کے اضطراب اور پھیلاؤ کے نتیجے میں ہوتا ہے۔

Sample audio for different speakers is listed below:

Speaker 0

Vietnamese

This section lists text to speech models for Vietnamese.

vits-piper-vi_VN-25hours_single-low

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/vi/vi_VN/25hours_single/low

Number of speakers | Sample rate
1 | 16000
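
Most examples below write their output to ./test.wav. To confirm that a generated file has the sample rate listed above, you can read its header. The following is a minimal Python sketch using the standard wave module; wav_info is a hypothetical helper, and the sketch assumes the file is PCM-encoded WAV, which is the only encoding the wave module reads.

```python
import wave


def wav_info(path: str) -> tuple:
    """Return (sample_rate, num_channels, duration_seconds) of a PCM WAV file."""
    with wave.open(path, "rb") as f:
        rate = f.getframerate()
        return rate, f.getnchannels(), f.getnframes() / rate


# Usage, after one of the examples has written ./test.wav (not run here):
#   rate, channels, seconds = wav_info("./test.wav")
#   print(rate, channels, round(seconds, 2))
```

For this model the reported rate should be 16000. A different value would suggest the file came from another model.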

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-vi_VN-25hours_single-low.tar.bz2

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI | URL | China mirror
arm64-v8a | Download | Download
armeabi-v7a | Download | Download
x86_64 | Download | Download
x86 | Download | Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-vi_VN-25hours_single-low with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-vi_VN-25hours_single-low/vi_VN-25hours_single-low.onnx";
  config.model.vits.tokens = "vits-piper-vi_VN-25hours_single-low/tokens.txt";
  config.model.vits.data_dir = "vits-piper-vi_VN-25hours_single-low/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Nước cũ đào gỗ mới, sông cũ chảy nước mới";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc   -I /tmp/sherpa-onnx/shared/include   -o /tmp/test-piper   /tmp/test-piper.c   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx and soundfile (used below to write the audio file) via

pip install sherpa-onnx soundfile

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-vi_VN-25hours_single-low.tar.bz2

You can use the following code to play with vits-piper-vi_VN-25hours_single-low

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-vi_VN-25hours_single-low/vi_VN-25hours_single-low.onnx",
            data_dir="vits-piper-vi_VN-25hours_single-low/espeak-ng-data",
            tokens="vits-piper-vi_VN-25hours_single-low/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Nước cũ đào gỗ mới, sông cũ chảy nước mới",
                     sid=0,
                     speed=1.0)

sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)

C++ API

You can use the following code to play with vits-piper-vi_VN-25hours_single-low with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-vi_VN-25hours_single-low/vi_VN-25hours_single-low.onnx";
  config.model.vits.tokens = "vits-piper-vi_VN-25hours_single-low/tokens.txt";
  config.model.vits.data_dir = "vits-piper-vi_VN-25hours_single-low/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Nước cũ đào gỗ mới, sông cũ chảy nước mới";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++   -std=c++17   -I /tmp/sherpa-onnx/shared/include   -o /tmp/test-piper   /tmp/test-piper.cc   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-cxx-api   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with vits-piper-vi_VN-25hours_single-low with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-vi_VN-25hours_single-low/vi_VN-25hours_single-low.onnx".into()),
                tokens: Some("vits-piper-vi_VN-25hours_single-low/tokens.txt".into()),
                data_dir: Some("vits-piper-vi_VN-25hours_single-low/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Nước cũ đào gỗ mới, sông cũ chảy nước mới";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-vi_VN-25hours_single-low with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-vi_VN-25hours_single-low/vi_VN-25hours_single-low.onnx',
        tokens: 'vits-piper-vi_VN-25hours_single-low/tokens.txt',
        dataDir: 'vits-piper-vi_VN-25hours_single-low/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Nước cũ đào gỗ mới, sông cũ chảy nước mới';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to play with vits-piper-vi_VN-25hours_single-low with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-vi_VN-25hours_single-low/vi_VN-25hours_single-low.onnx',
    tokens: 'vits-piper-vi_VN-25hours_single-low/tokens.txt',
    dataDir: 'vits-piper-vi_VN-25hours_single-low/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Nước cũ đào gỗ mới, sông cũ chảy nước mới', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-vi_VN-25hours_single-low with Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-vi_VN-25hours_single-low/vi_VN-25hours_single-low.onnx",
    lexicon: "",
    tokens: "vits-piper-vi_VN-25hours_single-low/tokens.txt",
    dataDir: "vits-piper-vi_VN-25hours_single-low/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Nước cũ đào gỗ mới, sông cũ chảy nước mới"
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-vi_VN-25hours_single-low with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-vi_VN-25hours_single-low/vi_VN-25hours_single-low.onnx";
config.Model.Vits.Tokens = "vits-piper-vi_VN-25hours_single-low/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-vi_VN-25hours_single-low/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Nước cũ đào gỗ mới, sông cũ chảy nước mới";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-vi_VN-25hours_single-low with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-vi_VN-25hours_single-low/vi_VN-25hours_single-low.onnx",
        tokens = "vits-piper-vi_VN-25hours_single-low/tokens.txt",
        dataDir = "vits-piper-vi_VN-25hours_single-low/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Nước cũ đào gỗ mới, sông cũ chảy nước mới",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.
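The callback above returns 1 to continue and 0 to stop. The same convention can be used to cap the amount of generated audio. Below is a minimal Python sketch of such a callback. The class name `StopAfter` is hypothetical, and the assumption that the Python API accepts a callable taking `(samples, progress)` through a `callback=` argument of `tts.generate()` should be checked against the Python API documentation.

```python
class StopAfter:
    """Count generated samples and request a stop once max_seconds
    of audio has been produced."""

    def __init__(self, sample_rate, max_seconds):
        self.max_samples = int(sample_rate * max_seconds)
        self.num_samples = 0

    def __call__(self, samples, progress=0.0):
        self.num_samples += len(samples)
        # 1 means to continue, 0 means to stop
        return 1 if self.num_samples < self.max_samples else 0
```

For example, `StopAfter(sample_rate=16000, max_seconds=5.0)` asks the engine to stop after roughly five seconds of audio for this 16 kHz model.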

Java API

You can use the following code to play with vits-piper-vi_VN-25hours_single-low with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-vi_VN-25hours_single-low/vi_VN-25hours_single-low.onnx");
    vits.setTokens("vits-piper-vi_VN-25hours_single-low/tokens.txt");
    vits.setDataDir("vits-piper-vi_VN-25hours_single-low/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Nước cũ đào gỗ mới, sông cũ chảy nước mới";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-vi_VN-25hours_single-low with Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-vi_VN-25hours_single-low/vi_VN-25hours_single-low.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-vi_VN-25hours_single-low/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-vi_VN-25hours_single-low/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Nước cũ đào gỗ mới, sông cũ chảy nước mới', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-vi_VN-25hours_single-low with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-vi_VN-25hours_single-low/vi_VN-25hours_single-low.onnx",
				Tokens:  "vits-piper-vi_VN-25hours_single-low/tokens.txt",
				DataDir: "vits-piper-vi_VN-25hours_single-low/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Nước cũ đào gỗ mới, sông cũ chảy nước mới"

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Nước cũ đào gỗ mới, sông cũ chảy nước mới

sample audio for each speaker is listed below:

Speaker 0

vits-piper-vi_VN-vais1000-medium

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/vi/vi_VN/vais1000/medium

Number of speakers    Sample rate (Hz)
1                     22050

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-vi_VN-vais1000-medium.tar.bz2
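If you prefer to script the download instead of using wget and tar, the sketch below does the same with the Python standard library. The function names are hypothetical. The URL pattern is taken from the address above and is assumed to hold for the other models in the tts-models release.

```python
import tarfile
import urllib.request
from pathlib import Path

BASE_URL = "https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models"


def model_url(name):
    """Build the download address for a model archive."""
    return f"{BASE_URL}/{name}.tar.bz2"


def extract_model(archive, dest="."):
    """Extract a .tar.bz2 archive into dest and return its top-level entries."""
    with tarfile.open(archive, "r:bz2") as tar:
        top_level = sorted(
            {Path(m.name).parts[0] for m in tar.getmembers() if Path(m.name).parts}
        )
        tar.extractall(dest)
    return top_level


def download_model(name, dest="."):
    """Download, extract, and then delete the archive (needs network access)."""
    archive = Path(dest) / f"{name}.tar.bz2"
    urllib.request.urlretrieve(model_url(name), archive)
    entries = extract_model(archive, dest)
    archive.unlink()
    return entries
```

Calling `download_model("vits-piper-vi_VN-vais1000-medium")` should leave a directory of that name in the current working directory, which is where the examples below expect it.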

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI            URL         China mirror
arm64-v8a      Download    Download
armeabi-v7a    Download    Download
x86_64         Download    Download
x86            Download    Download

If you don’t know what ABI is, you probably need to select arm64-v8a.
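If you have adb access to the device, you can also look up its ABI list instead of guessing. The sketch below assumes you obtained the comma-separated list from `adb shell getprop ro.product.cpu.abilist` (ordered from most to least preferred) and picks the first entry for which an APK is listed in the table above. The function name `pick_apk_abi` is hypothetical.

```python
# ABIs for which an APK is listed in the table above
AVAILABLE_ABIS = ("arm64-v8a", "armeabi-v7a", "x86_64", "x86")


def pick_apk_abi(device_abilist, available=AVAILABLE_ABIS):
    """Return the first ABI supported by the device that has an APK, or None."""
    for abi in device_abilist.split(","):
        abi = abi.strip()
        if abi in available:
            return abi
    return None
```

For a typical 64-bit ARM phone the list starts with arm64-v8a, which matches the default suggested above.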

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-vi_VN-vais1000-medium with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-vi_VN-vais1000-medium/vi_VN-vais1000-medium.onnx";
  config.model.vits.tokens = "vits-piper-vi_VN-vais1000-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-vi_VN-vais1000-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

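  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
  if (!tts) {
    // Creation fails, for example, when a model file cannot be found
    fprintf(stderr, "Failed to create the TTS engine. Please check your config\n");
    return -1;
  }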

  const char *text = "Nước cũ đào gỗ mới, sông cũ chảy nước mới";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html
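After running the example, you may want to confirm that ./test.wav has the sample rate listed for this model (22050 Hz). The following sketch reads the header with Python's standard `wave` module. It assumes the file is plain PCM, and the function name `wav_info` is hypothetical.

```python
import wave


def wav_info(path):
    """Return the sample rate, channel count, and duration of a PCM wave file."""
    with wave.open(str(path), "rb") as f:
        rate = f.getframerate()
        return {
            "sample_rate": rate,
            "channels": f.getnchannels(),
            "duration": f.getnframes() / rate,  # in seconds
        }
```

For instance, `wav_info("/tmp/test.wav")["sample_rate"]` is expected to be 22050 for this model.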

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-vi_VN-vais1000-medium.tar.bz2

You can use the following code to play with vits-piper-vi_VN-vais1000-medium with Python API.

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-vi_VN-vais1000-medium/vi_VN-vais1000-medium.onnx",
            data_dir="vits-piper-vi_VN-vais1000-medium/espeak-ng-data",
            tokens="vits-piper-vi_VN-vais1000-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Nước cũ đào gỗ mới, sông cũ chảy nước mới",
                     sid=0,
                     speed=1.0)

sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
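The Node.js example in this document reports a real-time factor (RTF), that is, the elapsed processing time divided by the duration of the generated audio. A minimal sketch of the same measurement in Python is shown below. The helper names are hypothetical, and `timed_generate` assumes a `tts` object created as in the code above.

```python
import time


def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """Elapsed processing time divided by the duration of the generated audio."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("no audio was generated")
    duration = num_samples / sample_rate
    return elapsed_seconds / duration


def timed_generate(tts, text, sid=0, speed=1.0):
    """Call tts.generate() as in the example above and return (audio, RTF)."""
    start = time.perf_counter()
    audio = tts.generate(text=text, sid=sid, speed=speed)
    elapsed = time.perf_counter() - start
    return audio, real_time_factor(elapsed, len(audio.samples), audio.sample_rate)
```

An RTF below 1 means that the audio is generated faster than it takes to play it back.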

C++ API

You can use the following code to play with vits-piper-vi_VN-vais1000-medium with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-vi_VN-vais1000-medium/vi_VN-vais1000-medium.onnx";
  config.model.vits.tokens = "vits-piper-vi_VN-vais1000-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-vi_VN-vais1000-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Nước cũ đào gỗ mới, sông cũ chảy nước mới";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with vits-piper-vi_VN-vais1000-medium with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-vi_VN-vais1000-medium/vi_VN-vais1000-medium.onnx".into()),
                tokens: Some("vits-piper-vi_VN-vais1000-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-vi_VN-vais1000-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Nước cũ đào gỗ mới, sông cũ chảy nước mới";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-vi_VN-vais1000-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-vi_VN-vais1000-medium/vi_VN-vais1000-medium.onnx',
        tokens: 'vits-piper-vi_VN-vais1000-medium/tokens.txt',
        dataDir: 'vits-piper-vi_VN-vais1000-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Nước cũ đào gỗ mới, sông cũ chảy nước mới';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to play with vits-piper-vi_VN-vais1000-medium with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-vi_VN-vais1000-medium/vi_VN-vais1000-medium.onnx',
    tokens: 'vits-piper-vi_VN-vais1000-medium/tokens.txt',
    dataDir: 'vits-piper-vi_VN-vais1000-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Nước cũ đào gỗ mới, sông cũ chảy nước mới', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-vi_VN-vais1000-medium with Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-vi_VN-vais1000-medium/vi_VN-vais1000-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-vi_VN-vais1000-medium/tokens.txt",
    dataDir: "vits-piper-vi_VN-vais1000-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Nước cũ đào gỗ mới, sông cũ chảy nước mới"
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-vi_VN-vais1000-medium with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-vi_VN-vais1000-medium/vi_VN-vais1000-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-vi_VN-vais1000-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-vi_VN-vais1000-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Nước cũ đào gỗ mới, sông cũ chảy nước mới";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-vi_VN-vais1000-medium with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-vi_VN-vais1000-medium/vi_VN-vais1000-medium.onnx",
        tokens = "vits-piper-vi_VN-vais1000-medium/tokens.txt",
        dataDir = "vits-piper-vi_VN-vais1000-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Nước cũ đào gỗ mới, sông cũ chảy nước mới",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with vits-piper-vi_VN-vais1000-medium with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-vi_VN-vais1000-medium/vi_VN-vais1000-medium.onnx");
    vits.setTokens("vits-piper-vi_VN-vais1000-medium/tokens.txt");
    vits.setDataDir("vits-piper-vi_VN-vais1000-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Nước cũ đào gỗ mới, sông cũ chảy nước mới";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-vi_VN-vais1000-medium with Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-vi_VN-vais1000-medium/vi_VN-vais1000-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-vi_VN-vais1000-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-vi_VN-vais1000-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Nước cũ đào gỗ mới, sông cũ chảy nước mới', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-vi_VN-vais1000-medium with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-vi_VN-vais1000-medium/vi_VN-vais1000-medium.onnx",
				Tokens:  "vits-piper-vi_VN-vais1000-medium/tokens.txt",
				DataDir: "vits-piper-vi_VN-vais1000-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Nước cũ đào gỗ mới, sông cũ chảy nước mới"

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Nước cũ đào gỗ mới, sông cũ chảy nước mới

sample audio for each speaker is listed below:

Speaker 0

vits-piper-vi_VN-vivos-x_low

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/vi/vi_VN/vivos/x_low

Number of speakers    Sample rate (Hz)
65                    16000
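Since this model has 65 speakers, the speaker ID (sid) passed to the examples below selects the voice. The sketch below validates a speaker ID before use and derives a per-speaker output filename. It assumes speaker IDs are zero-based, consistent with sid 0 in the examples, and the helper names are hypothetical.

```python
NUM_SPEAKERS = 65  # from the model info above


def check_speaker_id(sid, num_speakers=NUM_SPEAKERS):
    """Return sid unchanged if it is a valid zero-based speaker ID."""
    if not 0 <= sid < num_speakers:
        raise ValueError(f"sid must be in [0, {num_speakers - 1}], got {sid}")
    return sid


def output_filename(sid, prefix="test"):
    """Build a per-speaker wave filename, e.g. test-speaker-3.wav."""
    return f"{prefix}-speaker-{check_speaker_id(sid)}.wav"
```

Looping over `range(NUM_SPEAKERS)` with these helpers gives one output file per voice without overwriting test.wav.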

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-vi_VN-vivos-x_low.tar.bz2

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI            URL         China mirror
arm64-v8a      Download    Download
armeabi-v7a    Download    Download
x86_64         Download    Download
x86            Download    Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-vi_VN-vivos-x_low with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-vi_VN-vivos-x_low/vi_VN-vivos-x_low.onnx";
  config.model.vits.tokens = "vits-piper-vi_VN-vivos-x_low/tokens.txt";
  config.model.vits.data_dir = "vits-piper-vi_VN-vivos-x_low/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

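  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
  if (!tts) {
    // Creation fails, for example, when a model file cannot be found
    fprintf(stderr, "Failed to create the TTS engine. Please check your config\n");
    return -1;
  }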

  const char *text = "Nước cũ đào gỗ mới, sông cũ chảy nước mới";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
 -DSHERPA_ONNX_ENABLE_C_API=ON \
 -DCMAKE_BUILD_TYPE=Release \
 -DBUILD_SHARED_LIBS=ON \
 -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
 ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc \
  -I /tmp/sherpa-onnx/shared/include \
  -L /tmp/sherpa-onnx/shared/lib \
  -o /tmp/test-piper \
  /tmp/test-piper.c \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html
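
To check the file produced by the example, you can inspect its header with Python's standard `wave` module. This is a minimal sketch; the function name `wav_info` is hypothetical, and it assumes the generated file is uncompressed PCM, which is the only format the `wave` module reads. For this model the reported sample rate should be 16000.

```python
import wave


def wav_info(path: str) -> dict:
    """Return the sample rate, channel count and duration (seconds) of a PCM WAV file."""
    with wave.open(path, "rb") as f:
        rate = f.getframerate()
        frames = f.getnframes()
        return {
            "sample_rate": rate,
            "channels": f.getnchannels(),
            "duration": frames / rate,
        }
```

For example, `wav_info("/tmp/test.wav")` returns a dictionary you can print or compare against the values in the table at the top of this section.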

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-vi_VN-vivos-x_low.tar.bz2

You can use the following code to play with vits-piper-vi_VN-vivos-x_low

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-vi_VN-vivos-x_low/vi_VN-vivos-x_low.onnx",
            data_dir="vits-piper-vi_VN-vivos-x_low/espeak-ng-data",
            tokens="vits-piper-vi_VN-vivos-x_low/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Nước cũ đào gỗ mới, sông cũ chảy nước mới",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
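
Since this model has 65 speakers, you may want one file per speaker. The following is a minimal sketch that reuses the `tts.generate(text=..., sid=..., speed=...)` call shown above. The helper names `speaker_filename` and `generate_all_speakers` are hypothetical, and the default of 65 speakers is taken from the table at the top of this section.

```python
def speaker_filename(sid: int, prefix: str = "speaker") -> str:
    """Build an output filename such as speaker-07.wav for a speaker ID."""
    return f"{prefix}-{sid:02d}.wav"


def generate_all_speakers(tts, text: str, num_speakers: int = 65, speed: float = 1.0):
    """Generate one WAV file per speaker ID and return the filenames.

    `tts` is expected to be a sherpa_onnx.OfflineTts instance created as in
    the example above.
    """
    import soundfile as sf  # imported lazily; only needed when writing files

    filenames = []
    for sid in range(num_speakers):
        audio = tts.generate(text=text, sid=sid, speed=speed)
        name = speaker_filename(sid)
        sf.write(name, audio.samples, samplerate=audio.sample_rate)
        filenames.append(name)
    return filenames
```

With the `tts` object from the example above, `generate_all_speakers(tts, "Nước cũ đào gỗ mới, sông cũ chảy nước mới")` would write `speaker-00.wav` through `speaker-64.wav` to the current directory.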

C++ API

You can use the following code to try vits-piper-vi_VN-vivos-x_low with the C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-vi_VN-vivos-x_low/vi_VN-vivos-x_low.onnx";
  config.model.vits.tokens = "vits-piper-vi_VN-vivos-x_low/tokens.txt";
  config.model.vits.data_dir = "vits-piper-vi_VN-vivos-x_low/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Nước cũ đào gỗ mới, sông cũ chảy nước mới";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
 -DSHERPA_ONNX_ENABLE_C_API=ON \
 -DCMAKE_BUILD_TYPE=Release \
 -DBUILD_SHARED_LIBS=ON \
 -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
 ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++ \
  -std=c++17 \
  -I /tmp/sherpa-onnx/shared/include \
  -L /tmp/sherpa-onnx/shared/lib \
  -o /tmp/test-piper \
  /tmp/test-piper.cc \
  -lsherpa-onnx-cxx-api \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to try vits-piper-vi_VN-vivos-x_low with the Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-vi_VN-vivos-x_low/vi_VN-vivos-x_low.onnx".into()),
                tokens: Some("vits-piper-vi_VN-vivos-x_low/tokens.txt".into()),
                data_dir: Some("vits-piper-vi_VN-vivos-x_low/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Nước cũ đào gỗ mới, sông cũ chảy nước mới";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-vi_VN-vivos-x_low with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-vi_VN-vivos-x_low/vi_VN-vivos-x_low.onnx',
        tokens: 'vits-piper-vi_VN-vivos-x_low/tokens.txt',
        dataDir: 'vits-piper-vi_VN-vivos-x_low/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Nước cũ đào gỗ mới, sông cũ chảy nước mới';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.
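
The Node.js example above reports a real-time factor (RTF): the time spent generating divided by the duration of the generated audio, where the duration is the number of samples divided by the sample rate. The same arithmetic as a small Python sketch (the function name `real_time_factor` is hypothetical):

```python
def real_time_factor(elapsed_seconds: float, num_samples: int, sample_rate: int) -> float:
    """Return elapsed time divided by audio duration.

    A value below 1 means the audio was generated faster than real time.
    """
    duration = num_samples / sample_rate  # audio length in seconds
    return elapsed_seconds / duration
```

For instance, generating 32000 samples at 16000 Hz (2 seconds of audio) in 0.5 seconds gives an RTF of 0.25.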

Dart API

You can use the following code to try vits-piper-vi_VN-vivos-x_low with the Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-vi_VN-vivos-x_low/vi_VN-vivos-x_low.onnx',
    tokens: 'vits-piper-vi_VN-vivos-x_low/tokens.txt',
    dataDir: 'vits-piper-vi_VN-vivos-x_low/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Nước cũ đào gỗ mới, sông cũ chảy nước mới', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to try vits-piper-vi_VN-vivos-x_low with the Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-vi_VN-vivos-x_low/vi_VN-vivos-x_low.onnx",
    lexicon: "",
    tokens: "vits-piper-vi_VN-vivos-x_low/tokens.txt",
    dataDir: "vits-piper-vi_VN-vivos-x_low/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Nước cũ đào gỗ mới, sông cũ chảy nước mới"
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to try vits-piper-vi_VN-vivos-x_low with the C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-vi_VN-vivos-x_low/vi_VN-vivos-x_low.onnx";
config.Model.Vits.Tokens = "vits-piper-vi_VN-vivos-x_low/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-vi_VN-vivos-x_low/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Nước cũ đào gỗ mới, sông cũ chảy nước mới";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to try vits-piper-vi_VN-vivos-x_low with the Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-vi_VN-vivos-x_low/vi_VN-vivos-x_low.onnx",
        tokens = "vits-piper-vi_VN-vivos-x_low/tokens.txt",
        dataDir = "vits-piper-vi_VN-vivos-x_low/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Nước cũ đào gỗ mới, sông cũ chảy nước mới",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to try vits-piper-vi_VN-vivos-x_low with the Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-vi_VN-vivos-x_low/vi_VN-vivos-x_low.onnx");
    vits.setTokens("vits-piper-vi_VN-vivos-x_low/tokens.txt");
    vits.setDataDir("vits-piper-vi_VN-vivos-x_low/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Nước cũ đào gỗ mới, sông cũ chảy nước mới";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to try vits-piper-vi_VN-vivos-x_low with the Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-vi_VN-vivos-x_low/vi_VN-vivos-x_low.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-vi_VN-vivos-x_low/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-vi_VN-vivos-x_low/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Nước cũ đào gỗ mới, sông cũ chảy nước mới', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to try vits-piper-vi_VN-vivos-x_low with the Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-vi_VN-vivos-x_low/vi_VN-vivos-x_low.onnx",
				Tokens:  "vits-piper-vi_VN-vivos-x_low/tokens.txt",
				DataDir: "vits-piper-vi_VN-vivos-x_low/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Nước cũ đào gỗ mới, sông cũ chảy nước mới"

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Nước cũ đào gỗ mới, sông cũ chảy nước mới

audio samples for the different speakers are listed below:

Speaker 0

Speaker 1

Speaker 2

Speaker 3

Speaker 4

Speaker 5

Speaker 6

Speaker 7

Speaker 8

Speaker 9

Speaker 10

Speaker 11

Speaker 12

Speaker 13

Speaker 14

Speaker 15

Speaker 16

Speaker 17

Speaker 18

Speaker 19

Speaker 20

Speaker 21

Speaker 22

Speaker 23

Speaker 24

Speaker 25

Speaker 26

Speaker 27

Speaker 28

Speaker 29

Speaker 30

Speaker 31

Speaker 32

Speaker 33

Speaker 34

Speaker 35

Speaker 36

Speaker 37

Speaker 38

Speaker 39

Speaker 40

Speaker 41

Speaker 42

Speaker 43

Speaker 44

Speaker 45

Speaker 46

Speaker 47

Speaker 48

Speaker 49

Speaker 50

Speaker 51

Speaker 52

Speaker 53

Speaker 54

Speaker 55

Speaker 56

Speaker 57

Speaker 58

Speaker 59

Speaker 60

Speaker 61

Speaker 62

Speaker 63

Speaker 64

supertonic-3-vi

Info about this model

This model is supertonic 3 from https://huggingface.co/Supertone/supertonic-3

It supports 31 languages: en, ko, ja, ar, bg, cs, da, de, el, es, et, fi, fr, hi, hr, hu, id, it, lt, lv, nl, pl, pt, ro, ru, sk, sl, sv, tr, uk, vi.

This page shows samples for Vietnamese (vi).
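
The API examples below select the language by passing a small JSON object such as `{"lang": "vi"}` in the generation config's `extra` field. As a minimal sketch, the helper below (hypothetical name `make_extra`) builds that string and rejects codes that are not in the list of 31 languages above.

```python
import json

# The 31 language codes listed above.
SUPPORTED_LANGS = (
    "en", "ko", "ja", "ar", "bg", "cs", "da", "de", "el", "es", "et",
    "fi", "fr", "hi", "hr", "hu", "id", "it", "lt", "lv", "nl", "pl",
    "pt", "ro", "ru", "sk", "sl", "sv", "tr", "uk", "vi",
)


def make_extra(lang: str) -> str:
    """Return the JSON string for the `extra` field, e.g. {"lang": "vi"}."""
    if lang not in SUPPORTED_LANGS:
        raise ValueError(f"Unsupported language code: {lang!r}")
    return json.dumps({"lang": lang})
```

The returned string has the same form as the literal used in the C and C++ examples on this page.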

| Number of speakers | Sample rate |
|---|---|
| 10 | 24000 |

Speaker IDs

sid: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-supertonic-3-tts-int8-2026-05-11.tar.bz2

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

| ABI | URL | China mirror |
|---|---|---|
| arm64-v8a | Download | Download |
| armeabi-v7a | Download | Download |
| x86_64 | Download | Download |
| x86 | Download | Download |

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-supertonic-3-tts-int8-2026-05-11.tar.bz2

You can use the following code to play with supertonic-3

import sherpa_onnx
import soundfile as sf

tts_config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        supertonic=sherpa_onnx.OfflineTtsSupertonicModelConfig(
            duration_predictor="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
            text_encoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
            vector_estimator="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
            vocoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
            tts_json="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
            unicode_indexer="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
            voice_style="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
        ),
        debug=False,
        num_threads=2,
        provider="cpu",
    ),
)

tts = sherpa_onnx.OfflineTts(tts_config)

gen_config = sherpa_onnx.GenerationConfig()
gen_config.sid = 0
gen_config.num_steps = 8
gen_config.speed = 1.0
gen_config.extra["lang"] = "vi"

audio = tts.generate("Đây là công cụ chuyển văn bản thành giọng nói sử dụng kaldi thế hệ tiếp theo", gen_config)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
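
This model is configured with seven separate files, so a wrong path is easy to miss. The following is a minimal sketch that lists which of the files referenced in the example above are missing from the extracted directory. The function name `missing_model_files` is hypothetical, and the file names are taken from the config above.

```python
from pathlib import Path

# File names referenced by the Supertonic config in the example above.
SUPERTONIC_FILES = (
    "duration_predictor.int8.onnx",
    "text_encoder.int8.onnx",
    "vector_estimator.int8.onnx",
    "vocoder.int8.onnx",
    "tts.json",
    "unicode_indexer.bin",
    "voice.bin",
)


def missing_model_files(model_dir) -> list:
    """Return the expected file names that do not exist under model_dir."""
    root = Path(model_dir)
    return [name for name in SUPERTONIC_FILES if not (root / name).is_file()]
```

Calling `missing_model_files("sherpa-onnx-supertonic-3-tts-int8-2026-05-11")` before creating the `OfflineTts` object returns an empty list when every file is in place.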

C API

You can use the following code to try supertonic-3 with the C API.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));
  config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
  config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
  config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
  config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
  config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
  config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
  config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";

  config.model.num_threads = 2;

  // If you don't want to see debug messages, please set it to 0
  config.model.debug = 1;

  const char *text = "Đây là công cụ chuyển văn bản thành giọng nói sử dụng kaldi thế hệ tiếp theo";

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.num_steps = 8;
  gen_cfg.speed = 1.0;
  gen_cfg.extra = "{\"lang\": \"vi\"}";

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg,
                                             ProgressCallback, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
 -DSHERPA_ONNX_ENABLE_C_API=ON \
 -DCMAKE_BUILD_TYPE=Release \
 -DBUILD_SHARED_LIBS=ON \
 -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
 ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-supertonic.c. Then you can compile it with the following command:

gcc \
  -I /tmp/sherpa-onnx/shared/include \
  -L /tmp/sherpa-onnx/shared/lib \
  -o /tmp/test-supertonic \
  /tmp/test-supertonic.c \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-supertonic.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

C++ API

You can use the following code to try supertonic-3 with the C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;

  config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
  config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
  config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
  config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
  config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
  config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
  config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";

  config.model.num_threads = 2;

  // If you don't want to see debug messages, please set it to 0
  config.model.debug = 1;

  std::string filename = "./test.wav";
  std::string text = "Đây là công cụ chuyển văn bản thành giọng nói sử dụng kaldi thế hệ tiếp theo";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.num_steps = 8;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed
  gen_cfg.extra = R"({"lang": "vi"})";

  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
 -DSHERPA_ONNX_ENABLE_C_API=ON \
 -DCMAKE_BUILD_TYPE=Release \
 -DBUILD_SHARED_LIBS=ON \
 -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
 ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-supertonic.cc. Then you can compile it with the following command:

g++ \
  -std=c++17 \
  -I /tmp/sherpa-onnx/shared/include \
  -L /tmp/sherpa-onnx/shared/lib \
  -o /tmp/test-supertonic \
  /tmp/test-supertonic.cc \
  -lsherpa-onnx-cxx-api \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-supertonic.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to try supertonic-3 with the Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsSupertonicModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            supertonic: OfflineTtsSupertonicModelConfig {
                duration_predictor: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx".into()),
                text_encoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx".into()),
                vector_estimator: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx".into()),
                vocoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx".into()),
                tts_json: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json".into()),
                unicode_indexer: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin".into()),
                voice_style: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Đây là công cụ chuyển văn bản thành giọng nói sử dụng kaldi thế hệ tiếp theo";

    let gen_config = GenerationConfig {
        sid: 0,
        num_steps: 8,
        speed: 1.0,
        extra: Some(r#"{"lang": "vi"}"#.into()),
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with supertonic-3 with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      supertonic: {
        durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
        textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
        vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
        vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
        ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
        unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
        voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
      },
      debug: true,
      numThreads: 2,
      provider: 'cpu',
    },
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Đây là công cụ chuyển văn bản thành giọng nói sử dụng kaldi thế hệ tiếp theo';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  numSteps: 8,
  speed: 1.0,
  extra: {lang: 'vi'},
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to try supertonic-3 with the Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final supertonic = sherpa_onnx.OfflineTtsSupertonicModelConfig(
    durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
    textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
    vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
    vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
    ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
    unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
    voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    supertonic: supertonic,
    numThreads: 2,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    numSteps: 8,
    speed: 1.0,
    extra: {'lang': 'vi'},
  );
  final audio = tts.generateWithConfig(text: 'Đây là công cụ chuyển văn bản thành giọng nói sử dụng kaldi thế hệ tiếp theo', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to try supertonic-3 with the Swift API.

func run() {
  let supertonic = sherpaOnnxOfflineTtsSupertonicModelConfig(
    durationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
    textEncoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
    vectorEstimator: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
    vocoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
    ttsJson: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
    unicodeIndexer: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
    voiceStyle: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(supertonic: supertonic)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Đây là công cụ chuyển văn bản thành giọng nói sử dụng kaldi thế hệ tiếp theo"
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.numSteps = 8
  genConfig.speed = 1.0
  genConfig.extra = ["lang": "vi"]

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to try supertonic-3 with the C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Supertonic.DurationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.Model.Supertonic.TextEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.Model.Supertonic.VectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.Model.Supertonic.Vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.Model.Supertonic.TtsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.Model.Supertonic.UnicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.Model.Supertonic.VoiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";

config.Model.NumThreads = 2;
config.Model.Debug = 1;
config.Model.Provider = "cpu";

var tts = new OfflineTts(config);
var text = "Đây là công cụ chuyển văn bản thành giọng nói sử dụng kaldi thế hệ tiếp theo";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.NumSteps = 8;
genConfig.Speed = 1.0f;
genConfig.Extra = "{\"lang\": \"vi\"}";

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.
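
In the examples on this page, some bindings (C#, Java, Pascal, and Go) pass extra as a JSON string, while others (Node.js, Dart, Swift, and Kotlin) take a map. When the string form is needed, building it with a JSON library avoids hand-written escaping. The following Python sketch is only an illustration; make_extra is a hypothetical helper, not a sherpa-onnx function.

```python
import json


def make_extra(lang: str) -> str:
    """Serialize the per-request options to the JSON string form."""
    return json.dumps({"lang": lang})


extra = make_extra("vi")
print(extra)
```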

Kotlin API

You can use the following code to try supertonic-3 with the Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  val config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      supertonic = OfflineTtsSupertonicModelConfig(
        durationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
        textEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
        vectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
        vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
        ttsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
        unicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
        voiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
      ),
      numThreads = 2,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    numSteps = 8,
    speed = 1.0f,
    extra = mapOf("lang" to "vi"),
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Đây là công cụ chuyển văn bản thành giọng nói sử dụng kaldi thế hệ tiếp theo",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.
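
The callback in the Kotlin example returns 1 to continue and 0 to stop. The following Python sketch illustrates that convention without calling sherpa-onnx: run_chunks is a hypothetical stand-in for the generator, and make_limited_callback is a hypothetical callback that asks to stop once enough samples have arrived.

```python
from typing import Callable, List, Sequence


def make_limited_callback(max_samples: int) -> Callable[[Sequence[float]], int]:
    """Return a callback that requests a stop after max_samples samples."""
    received = 0

    def callback(samples: Sequence[float]) -> int:
        nonlocal received
        received += len(samples)
        # 1 means to continue, 0 means to stop
        return 1 if received < max_samples else 0

    return callback


def run_chunks(chunks: List[List[float]], callback: Callable[[Sequence[float]], int]) -> int:
    """Hypothetical driver: feed chunks to the callback until it returns 0."""
    delivered = 0
    for chunk in chunks:
        delivered += 1
        if callback(chunk) == 0:
            break
    return delivered


# Five chunks of 100 samples each; the callback stops once 250 samples are reached.
chunks = [[0.0] * 100 for _ in range(5)]
delivered = run_chunks(chunks, make_limited_callback(250))
print(delivered)
```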

Java API

You can use the following code to try supertonic-3 with the Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var supertonic = new OfflineTtsSupertonicModelConfig();
    supertonic.setDurationPredictor("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx");
    supertonic.setTextEncoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx");
    supertonic.setVectorEstimator("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx");
    supertonic.setVocoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx");
    supertonic.setTtsJson("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json");
    supertonic.setUnicodeIndexer("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin");
    supertonic.setVoiceStyle("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setSupertonic(supertonic);
    modelConfig.setNumThreads(2);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);

    var tts = new OfflineTts(config);
    var text = "Đây là công cụ chuyển văn bản thành giọng nói sử dụng kaldi thế hệ tiếp theo";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setNumSteps(8);
    genConfig.setSpeed(1.0f);
    genConfig.setExtra("{\"lang\": \"vi\"}");

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to try supertonic-3 with the Pascal API.

program test_supertonic;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Supertonic.DurationPredictor := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx';
  Config.Model.Supertonic.TextEncoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx';
  Config.Model.Supertonic.VectorEstimator := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx';
  Config.Model.Supertonic.Vocoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx';
  Config.Model.Supertonic.TtsJson := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json';
  Config.Model.Supertonic.UnicodeIndexer := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin';
  Config.Model.Supertonic.VoiceStyle := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin';

  Config.Model.NumThreads := 2;
  Config.Model.Debug := True;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.NumSteps := 8;
  GenConfig.Speed := 1.0;
  GenConfig.Extra := '{"lang": "vi"}';

  Audio := Tts.GenerateWithConfig('Đây là công cụ chuyển văn bản thành giọng nói sử dụng kaldi thế hệ tiếp theo', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to try supertonic-3 with the Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Supertonic: sherpa.OfflineTtsSupertonicModelConfig{
				DurationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
				TextEncoder:       "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
				VectorEstimator:   "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
				Vocoder:           "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
				TtsJson:           "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
				UnicodeIndexer:    "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
				VoiceStyle:        "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
			},
			NumThreads: 2,
			Debug:      true,
		},
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Đây là công cụ chuyển văn bản thành giọng nói sử dụng kaldi thế hệ tiếp theo"

	genConfig := sherpa.GenerationConfig{
		Sid:       0,
		NumSteps:  8,
		Speed:     1.0,
		Extra:     `{"lang": "vi"}`,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

Sample audio for different speakers is listed below:

Speaker 0

0

Xin chào thế giới.

1

Hôm nay bạn thế nào?

2

Bầu trời xanh và gió rất nhẹ.

3

Học máy giúp máy tính học từ dữ liệu.

4

Tổng hợp giọng nói chuyển văn bản thành âm thanh rõ ràng.

5

Học sinh đọc một câu chuyện ngắn trong thư viện.

6

Tàu bị trễ vì công việc bảo trì đường ray.

7

Các mô hình nhỏ chạy nhanh trên thiết bị cục bộ.

8

Trợ lý giọng nói hỗ trợ các công việc hằng ngày.

9

Việc đọc ổn định rất quan trọng cho câu ngắn và câu dài.

Speaker 1

0

Xin chào thế giới.

1

Hôm nay bạn thế nào?

2

Bầu trời xanh và gió rất nhẹ.

3

Học máy giúp máy tính học từ dữ liệu.

4

Tổng hợp giọng nói chuyển văn bản thành âm thanh rõ ràng.

5

Học sinh đọc một câu chuyện ngắn trong thư viện.

6

Tàu bị trễ vì công việc bảo trì đường ray.

7

Các mô hình nhỏ chạy nhanh trên thiết bị cục bộ.

8

Trợ lý giọng nói hỗ trợ các công việc hằng ngày.

9

Việc đọc ổn định rất quan trọng cho câu ngắn và câu dài.

Speaker 2

0

Xin chào thế giới.

1

Hôm nay bạn thế nào?

2

Bầu trời xanh và gió rất nhẹ.

3

Học máy giúp máy tính học từ dữ liệu.

4

Tổng hợp giọng nói chuyển văn bản thành âm thanh rõ ràng.

5

Học sinh đọc một câu chuyện ngắn trong thư viện.

6

Tàu bị trễ vì công việc bảo trì đường ray.

7

Các mô hình nhỏ chạy nhanh trên thiết bị cục bộ.

8

Trợ lý giọng nói hỗ trợ các công việc hằng ngày.

9

Việc đọc ổn định rất quan trọng cho câu ngắn và câu dài.

Speaker 3

0

Xin chào thế giới.

1

Hôm nay bạn thế nào?

2

Bầu trời xanh và gió rất nhẹ.

3

Học máy giúp máy tính học từ dữ liệu.

4

Tổng hợp giọng nói chuyển văn bản thành âm thanh rõ ràng.

5

Học sinh đọc một câu chuyện ngắn trong thư viện.

6

Tàu bị trễ vì công việc bảo trì đường ray.

7

Các mô hình nhỏ chạy nhanh trên thiết bị cục bộ.

8

Trợ lý giọng nói hỗ trợ các công việc hằng ngày.

9

Việc đọc ổn định rất quan trọng cho câu ngắn và câu dài.

Speaker 4

0

Xin chào thế giới.

1

Hôm nay bạn thế nào?

2

Bầu trời xanh và gió rất nhẹ.

3

Học máy giúp máy tính học từ dữ liệu.

4

Tổng hợp giọng nói chuyển văn bản thành âm thanh rõ ràng.

5

Học sinh đọc một câu chuyện ngắn trong thư viện.

6

Tàu bị trễ vì công việc bảo trì đường ray.

7

Các mô hình nhỏ chạy nhanh trên thiết bị cục bộ.

8

Trợ lý giọng nói hỗ trợ các công việc hằng ngày.

9

Việc đọc ổn định rất quan trọng cho câu ngắn và câu dài.

Speaker 5

0

Xin chào thế giới.

1

Hôm nay bạn thế nào?

2

Bầu trời xanh và gió rất nhẹ.

3

Học máy giúp máy tính học từ dữ liệu.

4

Tổng hợp giọng nói chuyển văn bản thành âm thanh rõ ràng.

5

Học sinh đọc một câu chuyện ngắn trong thư viện.

6

Tàu bị trễ vì công việc bảo trì đường ray.

7

Các mô hình nhỏ chạy nhanh trên thiết bị cục bộ.

8

Trợ lý giọng nói hỗ trợ các công việc hằng ngày.

9

Việc đọc ổn định rất quan trọng cho câu ngắn và câu dài.

Speaker 6

0

Xin chào thế giới.

1

Hôm nay bạn thế nào?

2

Bầu trời xanh và gió rất nhẹ.

3

Học máy giúp máy tính học từ dữ liệu.

4

Tổng hợp giọng nói chuyển văn bản thành âm thanh rõ ràng.

5

Học sinh đọc một câu chuyện ngắn trong thư viện.

6

Tàu bị trễ vì công việc bảo trì đường ray.

7

Các mô hình nhỏ chạy nhanh trên thiết bị cục bộ.

8

Trợ lý giọng nói hỗ trợ các công việc hằng ngày.

9

Việc đọc ổn định rất quan trọng cho câu ngắn và câu dài.

Speaker 7

0

Xin chào thế giới.

1

Hôm nay bạn thế nào?

2

Bầu trời xanh và gió rất nhẹ.

3

Học máy giúp máy tính học từ dữ liệu.

4

Tổng hợp giọng nói chuyển văn bản thành âm thanh rõ ràng.

5

Học sinh đọc một câu chuyện ngắn trong thư viện.

6

Tàu bị trễ vì công việc bảo trì đường ray.

7

Các mô hình nhỏ chạy nhanh trên thiết bị cục bộ.

8

Trợ lý giọng nói hỗ trợ các công việc hằng ngày.

9

Việc đọc ổn định rất quan trọng cho câu ngắn và câu dài.

Speaker 8

0

Xin chào thế giới.

1

Hôm nay bạn thế nào?

2

Bầu trời xanh và gió rất nhẹ.

3

Học máy giúp máy tính học từ dữ liệu.

4

Tổng hợp giọng nói chuyển văn bản thành âm thanh rõ ràng.

5

Học sinh đọc một câu chuyện ngắn trong thư viện.

6

Tàu bị trễ vì công việc bảo trì đường ray.

7

Các mô hình nhỏ chạy nhanh trên thiết bị cục bộ.

8

Trợ lý giọng nói hỗ trợ các công việc hằng ngày.

9

Việc đọc ổn định rất quan trọng cho câu ngắn và câu dài.

Speaker 9

0

Xin chào thế giới.

1

Hôm nay bạn thế nào?

2

Bầu trời xanh và gió rất nhẹ.

3

Học máy giúp máy tính học từ dữ liệu.

4

Tổng hợp giọng nói chuyển văn bản thành âm thanh rõ ràng.

5

Học sinh đọc một câu chuyện ngắn trong thư viện.

6

Tàu bị trễ vì công việc bảo trì đường ray.

7

Các mô hình nhỏ chạy nhanh trên thiết bị cục bộ.

8

Trợ lý giọng nói hỗ trợ các công việc hằng ngày.

9

Việc đọc ổn định rất quan trọng cho câu ngắn và câu dài.
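
The listing above pairs each speaker ID (0 to 9) with the same set of sentences. The following Python sketch shows one way to enumerate such speaker and sentence combinations before passing them to a TTS call. Only the first three sentences are included, and the output file naming is a hypothetical choice, not the naming used by the project.

```python
NUM_SPEAKERS = 10  # speakers 0..9 are listed above

# First three of the ten sentences listed above.
SENTENCES = [
    "Xin chào thế giới.",
    "Hôm nay bạn thế nào?",
    "Bầu trời xanh và gió rất nhẹ.",
]


def sample_jobs(num_speakers, sentences):
    """Return (sid, index, text, filename) for every speaker/sentence pair."""
    jobs = []
    for sid in range(num_speakers):
        for index, text in enumerate(sentences):
            filename = f"speaker-{sid}-{index}.wav"  # hypothetical naming scheme
            jobs.append((sid, index, text, filename))
    return jobs


jobs = sample_jobs(NUM_SPEAKERS, SENTENCES)
print(len(jobs), jobs[0][3], jobs[-1][3])
```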

Welsh

This section lists text-to-speech models for Welsh.

vits-piper-cy_GB-bu_tts-medium

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/cy/cy_GB/bu_tts/medium

Number of speakers | Sample rate (Hz)
7 | 22050
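
With 7 speakers, a speaker ID is expected to be in the range 0 to 6, assuming zero-based IDs as in the examples on this page (which all use sid 0). The following Python sketch checks a speaker ID before generation; check_sid is a hypothetical helper. The Rust example on this page reads the count at runtime with tts.num_speakers() instead of hard-coding it.

```python
NUM_SPEAKERS = 7  # taken from the table above


def check_sid(sid: int, num_speakers: int = NUM_SPEAKERS) -> int:
    """Return sid unchanged if it is a valid zero-based speaker ID."""
    if not 0 <= sid < num_speakers:
        raise ValueError(f"sid must be in [0, {num_speakers - 1}], got {sid}")
    return sid


print(check_sid(0), check_sid(6))
```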

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-cy_GB-bu_tts-medium.tar.bz2
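
The archive can also be fetched and unpacked from Python using only the standard library. This is a minimal sketch: model_url and download_and_extract are hypothetical helpers, and it assumes the archive unpacks into a directory named after the model, as the paths in the code examples on this page suggest.

```python
import tarfile
import urllib.request
from pathlib import Path

BASE_URL = "https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models"


def model_url(name: str) -> str:
    """Build the download URL for a model archive."""
    return f"{BASE_URL}/{name}.tar.bz2"


def download_and_extract(name: str, dest: str = ".") -> Path:
    """Download the archive into dest, unpack it, and delete the archive."""
    archive = Path(dest) / f"{name}.tar.bz2"
    urllib.request.urlretrieve(model_url(name), str(archive))
    with tarfile.open(archive, "r:bz2") as tar:
        tar.extractall(dest)
    archive.unlink()
    # Assumption: the archive contains a top-level directory named after the model.
    return Path(dest) / name


url = model_url("vits-piper-cy_GB-bu_tts-medium")
print(url)
# To actually download (requires network access):
# model_dir = download_and_extract("vits-piper-cy_GB-bu_tts-medium")
```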

Android APK

The following table lists the Android TTS Engine APKs that bundle this model for sherpa-onnx v1.13.2.

ABI | URL | China mirror
arm64-v8a | Download | Download
armeabi-v7a | Download | Download
x86_64 | Download | Download
x86 | Download | Download

If you don’t know what ABI is, you probably need to select arm64-v8a.
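
The choice of ABI can be written as a small lookup. The following Python sketch maps a machine string, as printed by uname -m on the device, to one of the four ABIs in the table, and falls back to arm64-v8a as advised above. pick_abi and the idea of reading the machine string from the device are assumptions for illustration.

```python
# Machine strings as reported by `uname -m`, mapped to Android ABI names.
ABI_BY_MACHINE = {
    "aarch64": "arm64-v8a",
    "armv7l": "armeabi-v7a",
    "x86_64": "x86_64",
    "i686": "x86",
}


def pick_abi(machine: str) -> str:
    """Return the APK ABI for a machine string; default to arm64-v8a."""
    return ABI_BY_MACHINE.get(machine.strip().lower(), "arm64-v8a")


print(pick_abi("aarch64"), pick_abi("armv7l"), pick_abi("unknown"))
```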

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to try vits-piper-cy_GB-bu_tts-medium with the C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-cy_GB-bu_tts-medium/cy_GB-bu_tts-medium.onnx";
  config.model.vits.tokens = "vits-piper-cy_GB-bu_tts-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-cy_GB-bu_tts-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

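  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
  if (!tts) {
    // Creation fails, for example, when a model path in the config is wrong
    fprintf(stderr, "Failed to create the TTS object. Please check your config\n");
    return -1;
  }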

  const char *text = "Ni all y gwynt ei hunan ei ddilyn, ac felly mae’n rhaid i’r gŵyr ddod i’r gorwel i weld y llwybr yn gyfarwydd";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // Free the generated audio and the TTS object to avoid memory leaks
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc   -I /tmp/sherpa-onnx/shared/include   -o /tmp/test-piper   /tmp/test-piper.c   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.
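
If you launch the compiled program from a script, the same library search path can be set for the child process only. The following Python sketch picks the variable by platform, as in the two export lines above. library_path_var and env_with_library_path are hypothetical helper names.

```python
import os
import sys


def library_path_var(platform: str = sys.platform) -> str:
    """Name of the dynamic-linker search path variable for the platform."""
    return "DYLD_LIBRARY_PATH" if platform == "darwin" else "LD_LIBRARY_PATH"


def env_with_library_path(lib_dir, platform=sys.platform, base_env=None):
    """Return a copy of the environment with lib_dir prepended to the search path."""
    env = dict(os.environ if base_env is None else base_env)
    var = library_path_var(platform)
    existing = env.get(var, "")
    env[var] = f"{lib_dir}:{existing}" if existing else lib_dir
    return env


linux_env = env_with_library_path("/tmp/sherpa-onnx/shared/lib", "linux", {})
mac_env = env_with_library_path(
    "/tmp/sherpa-onnx/shared/lib", "darwin", {"DYLD_LIBRARY_PATH": "/opt/lib"})
print(linux_env["LD_LIBRARY_PATH"])
print(mac_env["DYLD_LIBRARY_PATH"])
```

The returned dictionary can be passed as the env argument of subprocess.run, together with cwd="/tmp", to start /tmp/test-piper.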

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx and soundfile via

pip install sherpa-onnx soundfile

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-cy_GB-bu_tts-medium.tar.bz2

You can then use the following code to try vits-piper-cy_GB-bu_tts-medium:

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-cy_GB-bu_tts-medium/cy_GB-bu_tts-medium.onnx",
            data_dir="vits-piper-cy_GB-bu_tts-medium/espeak-ng-data",
            tokens="vits-piper-cy_GB-bu_tts-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Ni all y gwynt ei hunan ei ddilyn, ac felly mae’n rhaid i’r gŵyr ddod i’r gorwel i weld y llwybr yn gyfarwydd",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
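
If soundfile is not available, the samples can also be written with Python's standard library. The following is a minimal sketch, assuming the samples are floats in the range [-1, 1]; write_wave_int16 is a hypothetical helper that clips them and stores 16-bit mono PCM. The four sample values are placeholders, not model output.

```python
import os
import struct
import tempfile
import wave


def write_wave_int16(filename, samples, sample_rate):
    """Write float samples in [-1, 1] as a 16-bit mono PCM wave file."""
    pcm = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))  # clip out-of-range values
        pcm += struct.pack("<h", int(round(s * 32767)))
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)  # 2 bytes per sample
        f.setframerate(sample_rate)
        f.writeframes(bytes(pcm))


# Write placeholder samples to a temporary file and read the header back.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "test.wav")
    write_wave_int16(path, [0.0, 0.5, -1.0, 2.0], 22050)
    with wave.open(path, "rb") as f:
        info = (f.getnchannels(), f.getsampwidth(), f.getframerate(), f.getnframes())

print(info)
```

With the Python example above, the call would be write_wave_int16("test.wav", audio.samples, audio.sample_rate).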

C++ API

You can use the following code to try vits-piper-cy_GB-bu_tts-medium with the C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-cy_GB-bu_tts-medium/cy_GB-bu_tts-medium.onnx";
  config.model.vits.tokens = "vits-piper-cy_GB-bu_tts-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-cy_GB-bu_tts-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Ni all y gwynt ei hunan ei ddilyn, ac felly mae’n rhaid i’r gŵyr ddod i’r gorwel i weld y llwybr yn gyfarwydd";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++   -std=c++17   -I /tmp/sherpa-onnx/shared/include   -o /tmp/test-piper   /tmp/test-piper.cc   -L /tmp/sherpa-onnx/shared/lib   -lsherpa-onnx-cxx-api   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html
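
The C and C++ examples above load three paths relative to the current directory: the model file, tokens.txt, and the espeak-ng-data directory. A quick check before running can save a confusing startup error. The following Python sketch lists whatever is missing; missing_model_files is a hypothetical helper, and the demo runs on a throwaway directory rather than a real download.

```python
import tempfile
from pathlib import Path

# Entries referenced by the examples, relative to the model directory.
REQUIRED = ["cy_GB-bu_tts-medium.onnx", "tokens.txt", "espeak-ng-data"]


def missing_model_files(model_dir, required=REQUIRED):
    """Return the required entries that do not exist under model_dir."""
    root = Path(model_dir)
    return [name for name in required if not (root / name).exists()]


# Demo on a temporary directory that only contains tokens.txt.
with tempfile.TemporaryDirectory() as tmp:
    model_dir = Path(tmp) / "vits-piper-cy_GB-bu_tts-medium"
    model_dir.mkdir()
    (model_dir / "tokens.txt").write_text("")
    missing = missing_model_files(model_dir)

print(missing)
```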

Rust API

You can use the following code to try vits-piper-cy_GB-bu_tts-medium with the Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-cy_GB-bu_tts-medium/cy_GB-bu_tts-medium.onnx".into()),
                tokens: Some("vits-piper-cy_GB-bu_tts-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-cy_GB-bu_tts-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Ni all y gwynt ei hunan ei ddilyn, ac felly mae’n rhaid i’r gŵyr ddod i’r gorwel i weld y llwybr yn gyfarwydd";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-cy_GB-bu_tts-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-cy_GB-bu_tts-medium/cy_GB-bu_tts-medium.onnx',
        tokens: 'vits-piper-cy_GB-bu_tts-medium/tokens.txt',
        dataDir: 'vits-piper-cy_GB-bu_tts-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Ni all y gwynt ei hunan ei ddilyn, ac felly mae’n rhaid i’r gŵyr ddod i’r gorwel i weld y llwybr yn gyfarwydd';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

const start = Date.now();
const audio = tts.generate({text, generationConfig});
const stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to try vits-piper-cy_GB-bu_tts-medium with the Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-cy_GB-bu_tts-medium/cy_GB-bu_tts-medium.onnx',
    tokens: 'vits-piper-cy_GB-bu_tts-medium/tokens.txt',
    dataDir: 'vits-piper-cy_GB-bu_tts-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Ni all y gwynt ei hunan ei ddilyn, ac felly mae’n rhaid i’r gŵyr ddod i’r gorwel i weld y llwybr yn gyfarwydd', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to try vits-piper-cy_GB-bu_tts-medium with the Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-cy_GB-bu_tts-medium/cy_GB-bu_tts-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-cy_GB-bu_tts-medium/tokens.txt",
    dataDir: "vits-piper-cy_GB-bu_tts-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Ni all y gwynt ei hunan ei ddilyn, ac felly mae’n rhaid i’r gŵyr ddod i’r gorwel i weld y llwybr yn gyfarwydd"
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to try vits-piper-cy_GB-bu_tts-medium with the C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-cy_GB-bu_tts-medium/cy_GB-bu_tts-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-cy_GB-bu_tts-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-cy_GB-bu_tts-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Ni all y gwynt ei hunan ei ddilyn, ac felly mae’n rhaid i’r gŵyr ddod i’r gorwel i weld y llwybr yn gyfarwydd";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to try vits-piper-cy_GB-bu_tts-medium with the Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  val config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-cy_GB-bu_tts-medium/cy_GB-bu_tts-medium.onnx",
        tokens = "vits-piper-cy_GB-bu_tts-medium/tokens.txt",
        dataDir = "vits-piper-cy_GB-bu_tts-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Ni all y gwynt ei hunan ei ddilyn, ac felly mae’n rhaid i’r gŵyr ddod i’r gorwel i weld y llwybr yn gyfarwydd",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to try vits-piper-cy_GB-bu_tts-medium with the Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-cy_GB-bu_tts-medium/cy_GB-bu_tts-medium.onnx");
    vits.setTokens("vits-piper-cy_GB-bu_tts-medium/tokens.txt");
    vits.setDataDir("vits-piper-cy_GB-bu_tts-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Ni all y gwynt ei hunan ei ddilyn, ac felly mae’n rhaid i’r gŵyr ddod i’r gorwel i weld y llwybr yn gyfarwydd";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-cy_GB-bu_tts-medium with Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-cy_GB-bu_tts-medium/cy_GB-bu_tts-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-cy_GB-bu_tts-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-cy_GB-bu_tts-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Ni all y gwynt ei hunan ei ddilyn, ac felly mae’n rhaid i’r gŵyr ddod i’r gorwel i weld y llwybr yn gyfarwydd', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-cy_GB-bu_tts-medium with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-cy_GB-bu_tts-medium/cy_GB-bu_tts-medium.onnx",
				Tokens:  "vits-piper-cy_GB-bu_tts-medium/tokens.txt",
				DataDir: "vits-piper-cy_GB-bu_tts-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Ni all y gwynt ei hunan ei ddilyn, ac felly mae’n rhaid i’r gŵyr ddod i’r gorwel i weld y llwybr yn gyfarwydd"

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Ni all y gwynt ei hunan ei ddilyn, ac felly mae’n rhaid i’r gŵyr ddod i’r gorwel i weld y llwybr yn gyfarwydd

audio samples for the different speakers are listed below:

Speaker 0

Speaker 1

Speaker 2

Speaker 3

Speaker 4

Speaker 5

Speaker 6
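
The per-speaker samples above can be reproduced by looping over the speaker IDs. The following is a minimal Python sketch, not part of the official examples. The helper name `generate_for_all_speakers` is hypothetical, and it assumes the TTS object exposes a `num_speakers` attribute and a `generate(text=..., sid=..., speed=...)` method, as `sherpa_onnx.OfflineTts` appears to do in the Python API.

```python
from typing import Any, Dict


def generate_for_all_speakers(tts: Any, text: str, speed: float = 1.0) -> Dict[int, Any]:
    """Generate `text` once per speaker ID and return {speaker_id: audio}.

    Assumption: `tts` has `num_speakers` and `generate(text=, sid=, speed=)`.
    """
    results = {}
    for sid in range(tts.num_speakers):
        # Each call synthesizes the same text with a different speaker ID
        results[sid] = tts.generate(text=text, sid=sid, speed=speed)
    return results
```

With a real model, each returned audio object can then be saved, for example with `sf.write(f"speaker-{sid}.wav", audio.samples, samplerate=audio.sample_rate)`.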

vits-piper-cy_GB-gwryw_gogleddol-medium

Info about this model

This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/cy/cy_GB/gwryw_gogleddol/medium

Number of speakers | Sample rate
1 | 22050

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-cy_GB-gwryw_gogleddol-medium.tar.bz2
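
If wget and tar are not available, the archive can also be fetched and unpacked from Python. This is a minimal sketch using only the standard library. The helper name `download_and_extract` is hypothetical, and the sketch assumes the archive unpacks into a `vits-piper-cy_GB-gwryw_gogleddol-medium/` directory, which is the layout the examples in this section rely on.

```python
import tarfile
import urllib.request
from pathlib import Path

MODEL_URL = (
    "https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/"
    "vits-piper-cy_GB-gwryw_gogleddol-medium.tar.bz2"
)


def download_and_extract(url: str, dest_dir: str = ".") -> Path:
    """Download a .tar.bz2 archive, unpack it into dest_dir, delete the archive."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)

    # Name the local file after the last component of the URL
    archive = dest / url.rsplit("/", 1)[-1]
    urllib.request.urlretrieve(url, str(archive))

    with tarfile.open(archive, "r:bz2") as tar:
        tar.extractall(dest)

    # Remove the archive once its contents are unpacked
    archive.unlink()
    return dest
```

Calling `download_and_extract(MODEL_URL)` in the working directory should leave the model files where the code examples expect them.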

Android APK

The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2

ABI | URL | China mirror
arm64-v8a | Download | Download
armeabi-v7a | Download | Download
x86_64 | Download | Download
x86 | Download | Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

C API

You can use the following code to play with vits-piper-cy_GB-gwryw_gogleddol-medium with C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-cy_GB-gwryw_gogleddol-medium/cy_GB-gwryw_gogleddol-medium.onnx";
  config.model.vits.tokens = "vits-piper-cy_GB-gwryw_gogleddol-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-cy_GB-gwryw_gogleddol-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Ni all y gwynt ei hunan ei ddilyn, ac felly mae’n rhaid i’r gŵyr ddod i’r gorwel i weld y llwybr yn gyfarwydd";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.c. Then you can compile it with the following command:

gcc   -I /tmp/sherpa-onnx/shared/include   -L /tmp/sherpa-onnx/shared/lib   -o /tmp/test-piper   /tmp/test-piper.c   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-cy_GB-gwryw_gogleddol-medium.tar.bz2

You can use the following code to play with vits-piper-cy_GB-gwryw_gogleddol-medium

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-cy_GB-gwryw_gogleddol-medium/cy_GB-gwryw_gogleddol-medium.onnx",
            data_dir="vits-piper-cy_GB-gwryw_gogleddol-medium/espeak-ng-data",
            tokens="vits-piper-cy_GB-gwryw_gogleddol-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Ni all y gwynt ei hunan ei ddilyn, ac felly mae’n rhaid i’r gŵyr ddod i’r gorwel i weld y llwybr yn gyfarwydd",
                     sid=0,
                     speed=1.0)

sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
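
If soundfile is not installed, the generated samples can also be saved with Python's standard wave module. The following is a minimal sketch, assuming `audio.samples` is a sequence of mono floats in the range [-1, 1] and `audio.sample_rate` is an integer. The helper name `write_wave_pcm16` is hypothetical and not part of sherpa-onnx.

```python
import struct
import wave
from typing import Sequence


def write_wave_pcm16(filename: str, samples: Sequence[float], sample_rate: int) -> None:
    """Write mono float samples in [-1, 1] as a 16-bit PCM WAV file."""
    pcm = bytearray()
    for s in samples:
        # Clip to [-1, 1] so out-of-range values do not overflow int16
        clipped = max(-1.0, min(1.0, float(s)))
        pcm += struct.pack("<h", int(clipped * 32767))

    with wave.open(filename, "wb") as f:
        f.setnchannels(1)  # mono
        f.setsampwidth(2)  # 2 bytes per sample = 16-bit
        f.setframerate(sample_rate)
        f.writeframes(bytes(pcm))
```

Under these assumptions it would be called as `write_wave_pcm16("test.wav", audio.samples, audio.sample_rate)`.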

C++ API

You can use the following code to play with vits-piper-cy_GB-gwryw_gogleddol-medium with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-cy_GB-gwryw_gogleddol-medium/cy_GB-gwryw_gogleddol-medium.onnx";
  config.model.vits.tokens = "vits-piper-cy_GB-gwryw_gogleddol-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-cy_GB-gwryw_gogleddol-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Ni all y gwynt ei hunan ei ddilyn, ac felly mae’n rhaid i’r gŵyr ddod i’r gorwel i weld y llwybr yn gyfarwydd";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

In the following, we describe how to compile and run the above C++ example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake  -DSHERPA_ONNX_ENABLE_C_API=ON  -DCMAKE_BUILD_TYPE=Release  -DBUILD_SHARED_LIBS=ON  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared  ..

make
make install

You can find the required header files and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-piper.cc. Then you can compile it with the following command:

g++   -std=c++17   -I /tmp/sherpa-onnx/shared/include   -L /tmp/sherpa-onnx/shared/lib   -o /tmp/test-piper   /tmp/test-piper.cc   -lsherpa-onnx-cxx-api   -lsherpa-onnx-c-api   -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-piper

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-piper.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html
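
The callback convention used in the C++ example (return 1 to continue generating, return 0 to stop) can be illustrated with a small Python sketch. It assumes a callback signature of `(samples, progress) -> int`, mirroring `ProgressCallback` above. Whether the Python `generate` method accepts such a callback through a `callback=` argument is an assumption here, and the factory name `make_limited_callback` is hypothetical.

```python
from typing import Callable, Sequence


def make_limited_callback(max_samples: int) -> Callable[[Sequence[float], float], int]:
    """Build a callback that requests a stop once max_samples have been received."""
    received = 0

    def callback(samples: Sequence[float], progress: float) -> int:
        nonlocal received
        received += len(samples)
        # 1 means continue generating, 0 means stop
        return 1 if received < max_samples else 0

    return callback
```

A callback like this is one way to cap how much audio is generated for a very long input text.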

Rust API

You can use the following code to play with vits-piper-cy_GB-gwryw_gogleddol-medium with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-cy_GB-gwryw_gogleddol-medium/cy_GB-gwryw_gogleddol-medium.onnx".into()),
                tokens: Some("vits-piper-cy_GB-gwryw_gogleddol-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-cy_GB-gwryw_gogleddol-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Ni all y gwynt ei hunan ei ddilyn, ac felly mae’n rhaid i’r gŵyr ddod i’r gorwel i weld y llwybr yn gyfarwydd";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with vits-piper-cy_GB-gwryw_gogleddol-medium with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-cy_GB-gwryw_gogleddol-medium/cy_GB-gwryw_gogleddol-medium.onnx',
        tokens: 'vits-piper-cy_GB-gwryw_gogleddol-medium/tokens.txt',
        dataDir: 'vits-piper-cy_GB-gwryw_gogleddol-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Ni all y gwynt ei hunan ei ddilyn, ac felly mae’n rhaid i’r gŵyr ddod i’r gorwel i weld y llwybr yn gyfarwydd';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.
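
The Node.js example above reports a real-time factor (RTF), which is the elapsed synthesis time divided by the duration of the generated audio. The same arithmetic is sketched below in Python. The function name `real_time_factor` is hypothetical and not part of sherpa-onnx.

```python
def real_time_factor(elapsed_seconds: float, num_samples: int, sample_rate: int) -> float:
    """Return elapsed time divided by audio duration.

    A value below 1.0 means synthesis ran faster than real time.
    """
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # audio length in seconds
    return elapsed_seconds / duration
```

For example, 0.5 seconds spent generating 44100 samples at 22050 Hz (2 seconds of audio) gives an RTF of 0.25.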

Dart API

You can use the following code to play with vits-piper-cy_GB-gwryw_gogleddol-medium with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-cy_GB-gwryw_gogleddol-medium/cy_GB-gwryw_gogleddol-medium.onnx',
    tokens: 'vits-piper-cy_GB-gwryw_gogleddol-medium/tokens.txt',
    dataDir: 'vits-piper-cy_GB-gwryw_gogleddol-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Ni all y gwynt ei hunan ei ddilyn, ac felly mae’n rhaid i’r gŵyr ddod i’r gorwel i weld y llwybr yn gyfarwydd', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with vits-piper-cy_GB-gwryw_gogleddol-medium with Swift API.

func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-cy_GB-gwryw_gogleddol-medium/cy_GB-gwryw_gogleddol-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-cy_GB-gwryw_gogleddol-medium/tokens.txt",
    dataDir: "vits-piper-cy_GB-gwryw_gogleddol-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Ni all y gwynt ei hunan ei ddilyn, ac felly mae’n rhaid i’r gŵyr ddod i’r gorwel i weld y llwybr yn gyfarwydd"
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with vits-piper-cy_GB-gwryw_gogleddol-medium with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-cy_GB-gwryw_gogleddol-medium/cy_GB-gwryw_gogleddol-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-cy_GB-gwryw_gogleddol-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-cy_GB-gwryw_gogleddol-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "Ni all y gwynt ei hunan ei ddilyn, ac felly mae’n rhaid i’r gŵyr ddod i’r gorwel i weld y llwybr yn gyfarwydd";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with vits-piper-cy_GB-gwryw_gogleddol-medium with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      vits = OfflineTtsVitsModelConfig(
        model = "vits-piper-cy_GB-gwryw_gogleddol-medium/cy_GB-gwryw_gogleddol-medium.onnx",
        tokens = "vits-piper-cy_GB-gwryw_gogleddol-medium/tokens.txt",
        dataDir = "vits-piper-cy_GB-gwryw_gogleddol-medium/espeak-ng-data",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "Ni all y gwynt ei hunan ei ddilyn, ac felly mae’n rhaid i’r gŵyr ddod i’r gorwel i weld y llwybr yn gyfarwydd",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with vits-piper-cy_GB-gwryw_gogleddol-medium with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var vits = new OfflineTtsVitsModelConfig();
    vits.setModel("vits-piper-cy_GB-gwryw_gogleddol-medium/cy_GB-gwryw_gogleddol-medium.onnx");
    vits.setTokens("vits-piper-cy_GB-gwryw_gogleddol-medium/tokens.txt");
    vits.setDataDir("vits-piper-cy_GB-gwryw_gogleddol-medium/espeak-ng-data");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setVits(vits);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "Ni all y gwynt ei hunan ei ddilyn, ac felly mae’n rhaid i’r gŵyr ddod i’r gorwel i weld y llwybr yn gyfarwydd";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with vits-piper-cy_GB-gwryw_gogleddol-medium with Pascal API.

program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Vits.Model := 'vits-piper-cy_GB-gwryw_gogleddol-medium/cy_GB-gwryw_gogleddol-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-cy_GB-gwryw_gogleddol-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-cy_GB-gwryw_gogleddol-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Ni all y gwynt ei hunan ei ddilyn, ac felly mae’n rhaid i’r gŵyr ddod i’r gorwel i weld y llwybr yn gyfarwydd', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with vits-piper-cy_GB-gwryw_gogleddol-medium with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Vits: sherpa.OfflineTtsVitsModelConfig{
				Model:   "vits-piper-cy_GB-gwryw_gogleddol-medium/cy_GB-gwryw_gogleddol-medium.onnx",
				Tokens:  "vits-piper-cy_GB-gwryw_gogleddol-medium/tokens.txt",
				DataDir: "vits-piper-cy_GB-gwryw_gogleddol-medium/espeak-ng-data",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "Ni all y gwynt ei hunan ei ddilyn, ac felly mae’n rhaid i’r gŵyr ddod i’r gorwel i weld y llwybr yn gyfarwydd"

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

Ni all y gwynt ei hunan ei ddilyn, ac felly mae’n rhaid i’r gŵyr ddod i’r gorwel i weld y llwybr yn gyfarwydd

audio samples for the different speakers are listed below:

Speaker 0